## Deep Learning with PyTorch - by FreeCodeCamp

> Video link: [https://www.youtube.com/watch?v=GIsg-ZUy0MY](https://www.youtube.com/watch?v=GIsg-ZUy0MY&t=186s)

> Original channel link: [https://www.youtube.com/channel/UCEkIfTA9fTlly9bq5Hg-uzg](https://www.youtube.com/channel/UCEkIfTA9fTlly9bq5Hg-uzg)

> Source code link: [https://jovian.ml/aakashns/01-pytorch-basics](https://jovian.ml/aakashns/01-pytorch-basics)

> Copy of my notes on the topic

---

### Introduction

#### Course Contents:

<img src="./img/diag1.png"/>
<img src="./img/diag2.png"/>
<img src="./img/diag3.png"/>

### Part One: PyTorch Basics and Linear Regression

---

#### Tensors

At its core, PyTorch is a library for processing tensors. 
A tensor is a number, vector, matrix or any n-dimensional array. Let's create a tensor with a single number:

In [1]:
import torch
import jovian
import numpy as np

In [2]:
# check if gpu available
torch.cuda.is_available()

True

In [3]:
# tensor of single number
t1 = torch.tensor(data=4.)
print (t1.dtype)

torch.float32


In [4]:
# vector
t2 = torch.tensor([1., 2, 3, 4])
t2

tensor([1., 2., 3., 4.])

Note: all elements in a tensor must have the same data type. That is why the integers in t2 were converted to floats.
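
A small sketch of that dtype rule, added for illustration and not part of the original notebook. The variable names are made up, and it assumes PyTorch's default dtype has not been changed:

```python
import torch

# Mixed ints and floats are promoted to one common dtype (float32 by default).
mixed = torch.tensor([1., 2, 3, 4])
print(mixed.dtype)     # torch.float32

# A list holding only Python ints gives an integer tensor instead.
ints = torch.tensor([1, 2, 3, 4])
print(ints.dtype)      # torch.int64

# An existing tensor can be converted to another dtype afterwards.
as_float = ints.to(torch.float32)
print(as_float.dtype)  # torch.float32
```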

In [5]:
# Matrix
t3 = torch.tensor([[5., 6], [7, 8], [9, 10]])
t3

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])

In [6]:
# 3d array - think 2 matrices encapsulated in a list
t4 = torch.tensor(
[
    [
        [11, 12, 13],
        [13, 14, 15]
    ],
    [
        [15, 16, 17],
        [17, 18, 19.]
    ]
]
)

print (t4.shape)

torch.Size([2, 2, 3])


A key difference between a tensor and a nested list is that a tensor must have a regular shape. For example, t4 holds 2 matrices of shape 2x3 each; one of them cannot have a different shape.
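
To see the shape rule in action, here is a minimal sketch (my own example, not from the course) that tries to build a tensor from a ragged nested list. In the PyTorch versions I am aware of this raises a `ValueError`:

```python
import torch

# Both inner lists have length 2, so this is a valid 2x2 tensor.
regular = torch.tensor([[1., 2.], [3., 4.]])
print(regular.shape)  # torch.Size([2, 2])

# The second row is shorter than the first, so no regular shape exists.
try:
    torch.tensor([[1., 2.], [3.]])
    ragged_ok = True
except ValueError as err:
    ragged_ok = False
    print("could not build tensor:", err)
```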

#### Tensor operations and gradients

We can combine tensors with the usual arithmetic operations. Let's look at an example:



In [7]:
# Create tensors.
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

We've created 3 tensors x, w and b, all numbers. w and b have an additional parameter requires_grad set to True. We'll see what it does in just a moment.

Let's create a new tensor y by combining these tensors:

In [8]:
# Arithmetic operations
y = w * x + b
y

tensor(17., grad_fn=<AddBackward0>)

While setting up w and b we had set `requires_grad=True`. What makes PyTorch special is that it can automatically compute the derivative of y w.r.t. the tensors that have `requires_grad` set to `True`, i.e. w and b. So here we can compute dy/dw and dy/db but not dy/dx. To compute the derivatives, we call the `.backward` method on our result y.

In [9]:
## compute derivative
y.backward()

The derivatives of y w.r.t. the input tensors are stored in the `.grad` property of the respective tensors.


In [10]:
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


As expected, dy/dw has the same value as x i.e. 3, and dy/db has the value 1. Note that x.grad is None, because x doesn't have requires_grad set to True.

The "grad" in w.grad stands for gradient, which is another term for derivative, used mainly when dealing with matrices.
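
As an extra check, the same steps can be repeated on a slightly less trivial function and compared with the derivative worked out by hand. The function `y = w^2 * x + b` below is my own illustration and does not appear in the course:

```python
import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

# Illustrative function (not from the course): y = w^2 * x + b
y = w ** 2 * x + b
y.backward()

# By hand: dy/dw = 2 * w * x = 24 and dy/db = 1
print(w.grad)  # tensor(24.)
print(b.grad)  # tensor(1.)
```

The values PyTorch stores in `.grad` match the hand-computed derivatives, and `x.grad` stays `None` as before.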


#### Interoperability with numpy

Instead of reinventing the wheel, PyTorch interoperates really well with Numpy to leverage its existing ecosystem of tools and libraries.

In [12]:
x = np.array([[1, 2], [3, 4.]])
x

array([[1., 2.],
       [3., 4.]])

We can convert a Numpy array to a PyTorch tensor using `torch.from_numpy`

In [13]:
y = torch.from_numpy(x) # shared memory, does not create a new copy

y_new = torch.tensor(x)

print(x, y, y_new)

[[1. 2.]
 [3. 4.]] tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64) tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)


In [15]:
x[0][1] = -100

print (x, y, y_new) # y also got modified as it was not a copy

[[   1. -100.]
 [   3.    4.]] tensor([[   1., -100.],
        [   3.,    4.]], dtype=torch.float64) tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)


We can convert a PyTorch tensor to a Numpy array using the `.numpy` method of a tensor.

In [16]:
z = y_new.numpy() # again shares the memory

print (type(z))

<class 'numpy.ndarray'>


The interoperability between PyTorch and Numpy is really important because most datasets you'll work with will likely be read and preprocessed as Numpy arrays.

Even after creating predictions, you will generally want to convert them back to Numpy arrays for further use.
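
A rough sketch of that round trip, under my own assumptions: `inputs` and `weights` are hypothetical stand-ins for a dataset and a model, not names from the course. A tensor that is tracked by autograd has to be detached before it can be converted with `.numpy()`:

```python
import numpy as np
import torch

# Hypothetical inputs and weights standing in for a real dataset and model.
inputs = torch.from_numpy(np.array([[1., 2.], [3., 4.]], dtype=np.float32))
weights = torch.tensor([[0.5], [0.25]], requires_grad=True)

preds = inputs @ weights  # shape (2, 1), still tracked by autograd

# Calling preds.numpy() directly would raise a RuntimeError because preds
# requires grad; detach() returns a tensor that is cut off from the graph.
preds_np = preds.detach().numpy()

print(type(preds_np), preds_np.shape)
```

If the tensor lives on the GPU, it also has to be moved with `.cpu()` before the conversion.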

#### Exercises

1. What if one or more x, w or b were matrices, instead of numbers, in the above example? What would the result y and the gradients w.grad and b.grad look like in this case?

If the output y is a matrix, calling `y.backward()` raises an error: `RuntimeError: grad can be implicitly created only for scalar outputs`



In [31]:
x = torch.tensor(
    [
        [1.],
        [3]
    ]
)

print (x.shape)

w = torch.tensor(
    [
        [3., 3]
    ], requires_grad=True
)

b = torch.tensor(5., requires_grad=True)

y = x*w

print (w.shape, x.shape)

print (y)

torch.Size([2, 1])
torch.Size([1, 2]) torch.Size([2, 1])
tensor([[3., 3.],
        [9., 9.]], grad_fn=<MulBackward0>)
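
Worth noting about the cell above: `*` is an element-wise product with broadcasting, not a matrix product. The sketch below (my addition, reusing the same values for x and w) contrasts it with the `@` operator:

```python
import torch

x = torch.tensor([[1.], [3.]])   # shape (2, 1)
w = torch.tensor([[3., 3.]])     # shape (1, 2)

# Element-wise product: both operands are broadcast to shape (2, 2).
elementwise = x * w
print(elementwise.shape)  # torch.Size([2, 2])

# Matrix product: (1, 2) @ (2, 1) contracts the inner dimension.
matmul = w @ x
print(matmul)  # tensor([[12.]])
```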


In [19]:
x = torch.tensor(
    [
        [1],
        [3]
    ]
)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor([
    [2., 3],
    [7, 8]
], requires_grad=True)

y = w*x + b # scalar w times 2x1 x gives 2x1, which broadcasts against 2x2 b

print (y)

tensor([[ 6.,  7.],
        [19., 20.]], grad_fn=<AddBackward0>)


In [21]:
## compute derivative
y.backward()

RuntimeError: grad can be implicitly created only for scalar outputs

In [20]:
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: None
dy/db: None
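
One common way around this error is to reduce y to a single number before calling `.backward`. The sketch below is my own and not the course's answer to the exercise; it sums the entries of y, so each gradient is the derivative of that sum:

```python
import torch

x = torch.tensor([[1.], [3.]])                              # shape (2, 1)
w = torch.tensor(4., requires_grad=True)                    # scalar
b = torch.tensor([[2., 3.], [7., 8.]], requires_grad=True)  # shape (2, 2)

y = w * x + b  # shape (2, 2) via broadcasting

# Reduce to a scalar first, so backward() has a single output to differentiate.
y.sum().backward()

print(w.grad)  # tensor(8.)
print(b.grad)  # a 2x2 tensor of ones
```

Here `w.grad` is 8 because each entry of x appears in two columns of y, giving 2 * (1 + 3). `w.grad` and `b.grad` have the same shapes as w and b. Calling `y.backward(gradient=torch.ones_like(y))` should give the same result without the explicit sum.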
