<a href="https://colab.research.google.com/github/Paul-mwaura/Programming-PyTorch-for-Deep-Learning/blob/main/Introduction_to_Pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Introduction to PyTorch

### Importing the Library

In [2]:
import torch

In [3]:
print(torch.cuda.is_available())

True


> `torch.cuda.is_available()` returns `True` when PyTorch can see a CUDA-capable GPU.

> If it returns `False`, change your Colab runtime to GPU, then rerun the notebook from the top.
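When a GPU is not guaranteed, a common pattern is to choose the device once and reuse it. The following is a minimal sketch of that idea; the names `device` and `sample` are illustrative and not part of the original notebook.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device.
sample = torch.rand(2, 2, device=device)
print(sample.device)
```

Written this way, the same cell runs on both GPU and CPU runtimes.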

In [5]:
print(torch.rand(2,2)) 

tensor([[0.2249, 0.3785],
        [0.6265, 0.4632]])


### Tensors 
A tensor is both a container for numbers and a set of rules that define transformations between tensors that produce new tensors.

Think of tensors as multidimensional arrays.

In [6]:
x = torch.tensor([[0,0,1],[1,1,1],[0,0,0]]) 
x

tensor([[0, 0, 1],
        [1, 1, 1],
        [0, 0, 0]])

Change an element in a tensor by using standard Python indexing:

In [9]:
x[0][0] = 7
x[1][1] = 7
x[2][2] = 7

x

tensor([[7, 0, 1],
        [1, 7, 1],
        [0, 0, 7]])

You can use special creation functions to generate particular types of tensors. In particular, `ones()` and `zeros()` generate tensors filled with 1s and 0s, respectively:

In [10]:
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [11]:
torch.ones(4,4)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

Standard mathematical operations work on tensors (e.g., adding two tensors together):

In [13]:
torch.ones(1,2) + torch.ones(1,2)

tensor([[2., 2.]])

If you have a tensor of rank 0, or any tensor holding exactly one element, you can pull out the value as a plain Python number with item():

In [17]:
torch.rand(1).item()

0.5123191475868225
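As a small sketch of how `item()` behaves, the example below compares a rank-0 tensor, a one-element rank-1 tensor, and a tensor with more than one element. The variable names are illustrative, and the exact exception type raised in the failing case may differ between PyTorch versions, so both likely types are caught.

```python
import torch

scalar = torch.tensor(3.5)      # rank 0: shape is torch.Size([])
single = torch.tensor([3.5])    # rank 1 with exactly one element

print(scalar.dim(), scalar.item())
print(single.dim(), single.item())

# item() is only defined for tensors holding exactly one element.
try:
    torch.rand(2).item()
except (RuntimeError, ValueError) as err:
    print(f"item() failed: {err}")
```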

Tensors can live on the CPU or on the GPU and can be copied between devices by using the to() function.

In [23]:
cpu_tensor = torch.rand(2,2) 
print(cpu_tensor.device)
print(cpu_tensor)

cpu
tensor([[0.0041, 0.3583],
        [0.2925, 0.6987]])


In [24]:
gpu_tensor = cpu_tensor.to("cuda") 
print(gpu_tensor.device)
print(gpu_tensor) 

cuda:0
tensor([[0.0041, 0.3583],
        [0.2925, 0.6987]], device='cuda:0')


### Tensor Operations

We often need to find the maximum item in a tensor as well as the index that contains the maximum value.

In [25]:
torch.rand(2,2).max()

tensor(0.5002)

In [26]:
torch.rand(2,2).max().item()

0.9984431266784668
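The cells above return only the maximum value. To also get the index that holds it, one option is `argmax()`, or `max()` with a `dim` argument. A minimal sketch with a fixed tensor (the `values` tensor is made up for illustration):

```python
import torch

values = torch.tensor([[0.1, 0.9],
                       [0.4, 0.3]])

# argmax() with no arguments returns the index into the flattened tensor.
flat_index = values.argmax()

# max(dim=...) returns both the maxima and their indices along that dimension.
row_max, row_argmax = values.max(dim=1)

print(flat_index.item())
print(row_max, row_argmax)
```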

Sometimes, we’d like to change the type of a tensor; for example, from a LongTensor to a FloatTensor. We can do this with to():


In [27]:
long_tensor = torch.tensor([[0,0,1],[1,1,1],[0,0,0]])
long_tensor.type()

'torch.LongTensor'

In [28]:
float_tensor = torch.tensor([[0,0,1],[1,1,1],[0,0,0]]).to(dtype=torch.float32) 
float_tensor.type() 

'torch.FloatTensor'

In [29]:
float_tensor = long_tensor.to(dtype=torch.float32)
float_tensor.type()

'torch.FloatTensor'

If you want to save memory, look to see if an in-place function is defined. It has the same name as the original function but with an appended underscore (_), and it overwrites the tensor's data instead of allocating a new tensor.


In [30]:
random_tensor = torch.rand(2,2) 
random_tensor.log2()

tensor([[-6.1251, -0.4932],
        [-0.1851, -2.7723]])

In [31]:
random_tensor.log2_()

tensor([[-6.1251, -0.4932],
        [-0.1851, -2.7723]])
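Because both cells print the same values, the difference is easy to miss. The sketch below uses a fixed tensor (an illustrative choice) to show that `log2()` leaves the original untouched while `log2_()` overwrites it.

```python
import torch

original = torch.tensor([1.0, 2.0, 4.0])

# Out-of-place: returns a new tensor and leaves `original` untouched.
result = original.log2()
print(original)

# In-place: overwrites the data held by `original`.
same = original.log2_()
print(original)

# The returned tensor points at the same memory as `original`.
print(same.data_ptr() == original.data_ptr())
```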

Another common operation is reshaping a tensor. This can often occur because your neural network layer may require a slightly different input shape than what you currently have to feed into it.

In [35]:
flat_tensor = torch.rand(784) 
viewed_tensor = flat_tensor.view(1,28,28) 
print(f"Flat tensor: {flat_tensor[:15]}")  # print the first 15 values
print(f"Viewed tensor :{viewed_tensor.shape}") 

Flat tensor: tensor([0.9707, 0.4345, 0.2709, 0.8198, 0.2541, 0.7674, 0.9412, 0.4782, 0.3368,
        0.1648, 0.9091, 0.5670, 0.8138, 0.4993, 0.8596])
Viewed tensor :torch.Size([1, 28, 28])


In [36]:
reshaped_tensor = flat_tensor.reshape(1,28,28) 
reshaped_tensor.shape 

torch.Size([1, 28, 28])

> Note that the reshaped tensor’s shape has to have the same number of total elements as the original. If you try flat_tensor.reshape(3,28,28), you’ll see an error like this:
```
RuntimeError Traceback (most recent call last) <ipython-input-26-774c70ba5c08> in <module>() 
----> 1 flat_tensor.reshape(3,28,28)
RuntimeError: shape '[3, 28, 28]' is invalid for input of size 784
```
The difference between view() and reshape() is that view() always operates as a view on the original tensor, so if the underlying data is changed, the view will change too (and vice versa). However, view() can throw errors if the tensor's memory layout is not compatible with the requested shape, which typically happens when the tensor is not contiguous; that is, its elements are not laid out in the block of memory they would occupy if a new tensor of that shape was created from scratch. If this happens, you have to call tensor.contiguous() before you can use view(). reshape() does all that behind the scenes, returning a view when it can and a copy when it cannot, so in general, I recommend using reshape() rather than view().
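The following sketch illustrates both points on a small tensor: a view shares data with its source, and `view()` fails on a non-contiguous tensor where `reshape()` succeeds. The tensor `base` and the use of `t()` to produce a non-contiguous tensor are choices made for this example.

```python
import torch

base = torch.arange(6).reshape(2, 3)

# A view shares storage with the original tensor.
flat_view = base.view(6)
flat_view[0] = 100
print(base[0, 0].item())   # the change is visible through `base`

# Transposing produces a non-contiguous tensor, which view() cannot flatten.
transposed = base.t()
print(transposed.is_contiguous())
try:
    transposed.view(6)
except RuntimeError as err:
    print(f"view() failed: {err}")

# Either make the data contiguous first, or let reshape() handle it.
via_contiguous = transposed.contiguous().view(6)
via_reshape = transposed.reshape(6)
print(torch.equal(via_contiguous, via_reshape))
```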



> Finally, you might need to rearrange the dimensions of a tensor. You will likely come across this with images, which are often stored as [height, width, channel] tensors, whereas PyTorch prefers to deal with them in [channel, height, width] order. You can use permute() to handle this in a fairly straightforward manner:


In [37]:
hwc_tensor = torch.rand(640, 480, 3) 
chw_tensor = hwc_tensor.permute(2,0,1) 
chw_tensor.shape

torch.Size([3, 640, 480])

Here, we’ve just applied permute to a [640,480,3] tensor, with the arguments being the indexes of the tensor’s dimensions, so we want the final dimension (2, due to zero indexing) to be at the front of our tensor, followed by the remaining two dimensions in their original order.
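To make the index mapping concrete, here is a small sketch on a deliberately tiny tensor (the sizes are arbitrary). It checks that element `(h, w, c)` of the original ends up at `(c, h, w)`, and that `permute()` returns a view rather than copying data.

```python
import torch

hwc = torch.rand(4, 5, 3)       # small [height, width, channel] stand-in for an image
chw = hwc.permute(2, 0, 1)      # [channel, height, width]

print(chw.shape)

# Element (h, w, c) of the original is element (c, h, w) of the result.
print(hwc[1, 2, 0] == chw[0, 1, 2])

# permute() returns a view, so both tensors start at the same memory address.
print(chw.data_ptr() == hwc.data_ptr())
```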

### Tensor Broadcasting 

Borrowed from NumPy, broadcasting allows you to perform operations between a tensor and a smaller tensor. You can broadcast across two tensors if, starting backward from their trailing dimensions, each pair of dimensions meets one of these conditions:

- The two dimensions are equal.
- One of the dimensions is 1.

For instance, adding a tensor of shape [1] to a larger tensor works because its only dimension is 1 and, as there are no other dimensions, it can be expanded to cover the other tensor. If we tried to add a [2,2] tensor to a [3,3] tensor, we’d get this error message:

`The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 1 `

But we could add a [1,3] tensor to the [3,3] tensor without any trouble. Broadcasting is a handy little feature that increases brevity of code, and is often faster than manually expanding the tensor yourself.
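The two cases described above can be sketched as follows; the tensors `grid` and `row` are made up for illustration.

```python
import torch

grid = torch.zeros(3, 3)
row = torch.tensor([[1.0, 2.0, 3.0]])   # shape [1, 3]

# Trailing dimensions: 3 vs 3 (equal), then 3 vs 1 (one of them is 1).
summed = grid + row
print(summed.shape)
print(summed)

# Shapes [2, 2] and [3, 3] are neither equal nor 1 in any dimension, so this fails.
try:
    torch.ones(2, 2) + torch.ones(3, 3)
except RuntimeError as err:
    print(f"broadcast failed: {err}")
```

Here `row` is conceptually repeated along the first dimension, so every row of `summed` equals `[1., 2., 3.]`.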