### Methods for calculating derivatives
*    Numerical differentiation
*    Symbolic differentiation
*    Automatic differentiation

#### Note: 

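-    Symbolic differentiation can produce inefficient code unless it is done carefully, because the derivative expression can grow much larger than the original. It also requires converting a computer program (with its loops and branches) into a single expression, which is difficult.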
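
To make the difference between the approaches concrete, here is a minimal sketch in plain Python that is not part of the original notebook. It differentiates f(x) = 3(x + 2)^2 once with a central finite difference (numerical) and once with dual numbers (forward-mode automatic differentiation). The `Dual` class and the helper names are illustrative assumptions. PyTorch's autograd uses reverse mode rather than forward mode, so this shows the idea, not PyTorch's implementation.

```python
class Dual:
    """Number of the form value + deriv * eps, where eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def f(x):
    # Works for plain floats and for Dual numbers alike
    return 3 * (x + 2) * (x + 2)


def numerical_derivative(func, x, h=1e-5):
    # Central difference: an approximation with truncation and round-off error
    return (func(x + h) - func(x - h)) / (2 * h)


def automatic_derivative(func, x):
    # Seed the input derivative with 1 and read the derivative off the output
    return func(Dual(x, 1.0)).deriv


numeric = numerical_derivative(f, 1.0)
automatic = automatic_derivative(f, 1.0)
print(numeric, automatic)
```

The exact derivative is 6(x + 2), which is 18 at x = 1. The dual-number result matches it exactly, while the finite difference is only close to it and depends on the choice of `h`.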


In [None]:
import torch
import torchvision
from torch import nn
from torch import optim

In [None]:
x = torch.ones(2, 2, requires_grad=True) # try with False or default
x

In [None]:
# x2= torch.ones(2, 2, requires_grad=True) # problem with * 2?

In [None]:
x.data

In [None]:
print(x.grad)

In [None]:
y = x + 2
y

In [None]:
y.grad_fn

In [None]:
z = y * y * 3

In [None]:
# z.backward(gradient=torch.ones(2,2, dtype=torch.float32))

In [None]:
out = z.mean() # try also sum
out

In [None]:
out.backward()

In [None]:
x.grad

<br>
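
The value of `x.grad` above can be checked by hand. With `out = z.mean()` over the four elements of `z = 3 * (x + 2)**2`:

$$
\text{out} = \frac{1}{4}\sum_{i,j} 3\,(x_{ij} + 2)^2
\quad\Longrightarrow\quad
\frac{\partial\,\text{out}}{\partial x_{ij}} = \frac{3}{2}\,(x_{ij} + 2)
$$

At $x_{ij} = 1$ this gives $\frac{3}{2} \cdot 3 = 4.5$ for every element. Using `sum()` instead of `mean()` drops the factor $\frac{1}{4}$, so each entry would be $6 \cdot 3 = 18$.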

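### What is the point of the `gradient` parameter in the `backward()` method?

- The `gradient` argument of a tensor's `backward()` method supplies one weight per element of that tensor. `backward()` then differentiates the weighted sum of the tensor's elements with respect to the leaf tensors (a vector-Jacobian product). For a scalar output the argument can be omitted, which is equivalent to a weight of 1.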
<br>
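
As a small illustration of the weighted-sum interpretation, the sketch below calls `backward(gradient=w)` on a non-scalar tensor and compares the result with differentiating `(z * w).sum()` explicitly. The weights in `w` are arbitrary values chosen for this example.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
z = (x + 2) ** 2 * 3  # non-scalar output, same shape as x

# One weight per element of z (arbitrary illustrative values)
w = torch.tensor([[1.0, 0.5], [0.0, 2.0]])

# Backpropagate through z with w as the upstream gradient
z.backward(gradient=w)
grad_with_argument = x.grad.clone()

# The same computation written as an explicit weighted sum
x2 = torch.ones(2, 2, requires_grad=True)
z2 = (x2 + 2) ** 2 * 3
(z2 * w).sum().backward()
grad_explicit = x2.grad.clone()

print(grad_with_argument)
print(torch.allclose(grad_with_argument, grad_explicit))
```

Each element of `z` has derivative 6(x + 2) = 18 at x = 1, so the gradient is simply `18 * w`. Passing `torch.ones(2, 2)` as `gradient` would therefore match calling `z.sum().backward()`.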

In [None]:
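model = torchvision.models.resnet18(pretrained=True)
# torchvision 0.13+ prefers weights=torchvision.models.ResNet18_Weights.DEFAULT over pretrained=True
# If this line raises an error inside the notebook, run it in a plain Python session outside the IPython kernel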

data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)

In [None]:
prediction = model(data)

In [None]:
loss = (prediction - labels).sum()
loss.backward() # backward pass

In [None]:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

In [None]:
optimizer.step() # gradient descent

In [None]:
with torch.no_grad():
    model(data)

In [None]:
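# Freeze the pretrained backbone so autograd skips its parameters
for param in model.parameters():
    param.requires_grad = False
# Replace the classifier head; a new layer's parameters require gradients by default
model.fc = nn.Linear(512, 10)
# Pass only the new head's parameters to the optimizer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)
optimizer.step() # has no effect until a forward and backward pass populates .grad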

In [None]:
a = torch.ones(2, requires_grad=True) # try with False or default

In [None]:
b = (a**2).sum()

In [None]:
c = (7*a).sum()

In [None]:
b.backward()

In [None]:
a.grad

In [None]:
c.backward()

In [None]:
a.grad

In [None]:
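# The second a.grad is [9., 9.], not [7., 7.]: backward() adds to .grad instead of overwriting it.
# This accumulation is why we call model.zero_grad() before each backward pass.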
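
The accumulation can be avoided by resetting the gradient between the two backward passes. The sketch below repeats the example with an in-place `a.grad.zero_()`, which is what `model.zero_grad()` does for every parameter of a model (depending on the PyTorch version it may set the gradients to `None` instead of zero).

```python
import torch

a = torch.ones(2, requires_grad=True)

# First backward pass: d/da sum(a**2) = 2 * a
(a ** 2).sum().backward()
grad_b = a.grad.clone()

# Reset the accumulated gradient in place before the next pass
a.grad.zero_()

# Second backward pass: d/da sum(7 * a) = 7, with nothing left over from the first pass
(7 * a).sum().backward()
grad_c = a.grad.clone()

print(grad_b, grad_c)
```

Without the reset, the second gradient would be the sum of both passes, as in the cells above.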