A basic PyTorch tutorial, adapted from the official [60-minute blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html).

In [64]:
import torch
import numpy as np

Initialize a tensor directly from data, e.g. a 2-D nested list. The datatype is inferred automatically.

In [65]:
data = [[1,2], [3, 4]]
x_data = torch.tensor(data=data)

Initialize a tensor from a numpy array.

In [66]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

Initialize a tensor from another tensor. The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

In [67]:
x_ones = torch.ones_like(x_data) # retain the properties of x_data
print(f"ones tensor: {x_ones}\n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # override the datatype of x_data
print(f"random tensor: {x_rand}\n")

ones tensor: tensor([[1, 1],
        [1, 1]])

random tensor: tensor([[0.0901, 0.4822],
        [0.9089, 0.0780]])
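
To make "retains the properties unless explicitly overridden" concrete, here is a minimal sketch using `torch.zeros_like`. The names `x_int`, `zeros_same` and `zeros_float` are illustrative and not part of the original notebook. Because `rand_like` draws floating-point samples, the cell above overrides the integer datatype of `x_data`; the same override is shown here with zeros so the result is deterministic.

```python
import torch

# Python integers are inferred as torch.int64
x_int = torch.tensor([[1, 2], [3, 4]])

# *_like functions copy shape and dtype from the argument tensor
zeros_same = torch.zeros_like(x_int)
print(zeros_same.shape, zeros_same.dtype)

# passing dtype overrides the datatype but keeps the shape
zeros_float = torch.zeros_like(x_int, dtype=torch.float)
print(zeros_float.shape, zeros_float.dtype)
```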



Initialize a tensor with random or constant values. `shape` is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.

In [68]:
shape = (2, 3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"random tensor: {rand_tensor}")
print(f"ones tensor: {ones_tensor}")
print(f"zeros tensor: {zeros_tensor}")

random tensor: tensor([[0.4272, 0.0437, 0.8781],
        [0.4689, 0.9854, 0.6647]])
ones tensor: tensor([[1., 1., 1.],
        [1., 1., 1.]])
zeros tensor: tensor([[0., 0., 0.],
        [0., 0., 0.]])
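
As a small illustration of how `shape` is consumed, the sketch below passes the same dimensions as a tuple and as separate integers, and uses `torch.full` for a constant other than 0 or 1. The variable names `a`, `b` and `sevens` are made up for this example.

```python
import torch

shape = (2, 3)

a = torch.zeros(shape)            # shape passed as a tuple
b = torch.zeros(2, 3)             # same shape passed as separate integers
sevens = torch.full(shape, 7.0)   # every element set to the given constant

print(a.shape == b.shape)
print(sevens)
```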


Tensors also have attributes that describe their shape, datatype, and the device on which they are stored.

In [69]:
tensor = torch.rand(3, 4)
print(f"random tensor: {tensor}")
print(f"shape of tensor: {tensor.shape}")
print(f"data type of tensor: {tensor.dtype}")
print(f"device tensor is stored on: {tensor.device}")

random tensor: tensor([[0.2671, 0.6848, 0.9897, 0.4451],
        [0.0690, 0.9546, 0.5507, 0.0236],
        [0.6751, 0.8684, 0.4806, 0.7136]])
shape of tensor: torch.Size([3, 4])
data type of tensor: torch.float32
device tensor is stored on: cpu


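Tensor operations. Tensors support many operations, including transposing, indexing, and slicing. Each of them can run on an accelerator (here Apple's MPS backend) when one is available, so we first move the tensor there.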

In [70]:
if torch.backends.mps.is_available():
    tensor = tensor.to('mps')
    print(f"device tensor is stored on {tensor.device}")

device tensor is stored on mps:0
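
The cell above only checks for MPS. A slightly more general sketch, assuming the same notebook might also run on a CUDA machine or on a CPU-only machine, picks the first available backend and falls back to the CPU. The name `device` is illustrative, and the `getattr` guard is only there for older PyTorch builds that lack `torch.backends.mps`.

```python
import torch

# Prefer CUDA, then Apple's MPS backend, then fall back to the CPU.
mps_backend = getattr(torch.backends, "mps", None)
if torch.cuda.is_available():
    device = torch.device("cuda")
elif mps_backend is not None and mps_backend.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

tensor = torch.rand(3, 4).to(device)
print(f"device tensor is stored on: {tensor.device}")
```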


In [71]:
# tensors support NumPy-style indexing and slicing
tensor = torch.ones(4, 4)
print(tensor)
tensor[:, 1] = 0
print(tensor)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
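
A few more indexing patterns, sketched on a tensor with distinct values so each selection is easy to see. The names `grid`, `first_row`, `first_col`, `last_col` and `block` are illustrative.

```python
import torch

# values 0..11 laid out as 3 rows by 4 columns
grid = torch.arange(12).reshape(3, 4)

first_row = grid[0]        # row 0
first_col = grid[:, 0]     # column 0 of every row
last_col = grid[..., -1]   # last column, regardless of leading dimensions
block = grid[1:, :2]       # rows 1 and 2, columns 0 and 1

print(first_row)
print(first_col)
print(last_col)
print(block)
```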


In [72]:
# we can also concatenate multiple tensors (along dimension 0 by default)

t1 = torch.cat([tensor, tensor, tensor, tensor])
print(t1)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
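
The output above grows along the rows because `torch.cat` joins along dimension 0 unless told otherwise. The sketch below compares that with `dim=1` and with `torch.stack`, which adds a new dimension instead of extending an existing one. The names `a`, `rows`, `cols` and `stacked` are illustrative.

```python
import torch

a = torch.ones(4, 4)

rows = torch.cat([a, a, a], dim=0)   # extends dimension 0
cols = torch.cat([a, a, a], dim=1)   # extends dimension 1
stacked = torch.stack([a, a, a])     # creates a new leading dimension

print(rows.shape)
print(cols.shape)
print(stacked.shape)
```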


In [73]:
# this computes the element-wise product
print(f"tensor.mul(tensor) {tensor.mul(tensor)}")
# this is the equivalent operator syntax
print(f"tensor * tensor {tensor * tensor}")

tensor.mul(tensor) tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
tensor * tensor tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
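
The element-wise product is easy to confuse with matrix multiplication. As a short sketch on a small matrix `m` (an illustrative name), `*` multiplies matching entries while `@` (equivalently `torch.matmul`) performs a matrix product.

```python
import torch

m = torch.tensor([[1., 2.],
                  [3., 4.]])

elementwise = m * m             # each entry multiplied by itself
matrix_product = m @ m          # rows of m times columns of m
same_product = torch.matmul(m, m)

print(elementwise)
print(matrix_product)
```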


In-place operations: operations that have a `_` suffix are `in-place`. For example, `x.copy_(y)` and `x.t_()` will change `x`.

In [74]:
print(tensor)
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
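
To contrast the two forms side by side, this sketch calls `add` (which returns a new tensor) and then `add_` (which overwrites its operand). The names `x`, `y` and `x_before_inplace` are illustrative.

```python
import torch

x = torch.ones(3)

y = x.add(5)                   # out-of-place: returns a new tensor
x_before_inplace = x.clone()   # snapshot, x is still all ones here

x.add_(5)                      # in-place: x itself is modified

print(y)
print(x_before_inplace)
print(x)
```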


### Bridge with NumPy
Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.

In [75]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


In [76]:
t.add_(1)
# the change is reflected in both t and n
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


In [77]:
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([3., 3., 3., 3., 3.])
n: [3. 3. 3. 3. 3.]
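
Sharing also works when starting from the NumPy side with `torch.from_numpy`. If sharing is not wanted, `torch.tensor` copies the data instead. The sketch below shows both; the names `arr`, `t_shared` and `t_copy` are illustrative.

```python
import numpy as np
import torch

arr = np.ones(5)

t_shared = torch.from_numpy(arr)   # shares memory with arr
t_copy = torch.tensor(arr)         # independent copy of the data

np.add(arr, 1, out=arr)            # modify the array in place

print(f"t_shared: {t_shared}")     # follows the change
print(f"t_copy: {t_copy}")         # keeps the original values
```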


# A gentle introduction to `torch.autograd`
Neural networks (NNs) are a collection of nested functions that are executed on some input data. These functions are defined by `parameters` (consisting of weights and biases), which in PyTorch are stored in tensors.

Training an NN happens in two steps:

## forward propagation

In forward propagation, the NN makes its best guess about the correct output. It runs the input data through each of its functions to make this guess.

## backward propagation

In backward propagation, the NN adjusts its parameters in proportion to the error in its guess. It does this by traversing backwards from the output, collecting the derivatives of the error with respect to the parameters of the functions (`gradients`), and optimizing the parameters using `gradient descent`.
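
Before moving to a real model, "collecting the derivatives" can be seen on a tiny hand-checkable function. This is only a sketch: the function `Q = 3a^3 - b^2` and the names `a`, `b`, `Q` are chosen for illustration, and the expected gradients follow from calculus, `dQ/da = 9a^2` and `dQ/db = -2b`.

```python
import torch

# requires_grad=True asks autograd to track operations on these tensors
a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

Q = 3 * a**3 - b**2

# backward() needs a scalar, so reduce Q with sum() first
Q.sum().backward()

print(a.grad)   # matches 9 * a**2
print(b.grad)   # matches -2 * b
```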

### Start from an example
We load a pretrained resnet18 model from `torchvision`. We create a random data tensor to represent a single image with 3 channels and a height and width of 64, and its corresponding `label` initialized to some random values. The output of this pretrained model has shape (1, 1000), one value per ImageNet class, so the label has the same shape.

In [78]:
import torch
from torchvision.models import resnet18, ResNet18_Weights
model = resnet18(weights=ResNet18_Weights.DEFAULT)
# data represents a batch of one 3-channel 64x64 image; each value is a float in [0, 1)
data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)

In [79]:
print(f"data dimension is {data.shape}")
print(f"data datatype is {data.dtype}")
print(f"label dimension is {labels.shape}")
print(f"label datatype is {labels.dtype}")

data dimension is torch.Size([1, 3, 64, 64])
data datatype is torch.float32
label dimension is torch.Size([1, 1000])
label datatype is torch.float32


With the pretrained resnet18 model loaded, we run the input data through each of its layers to make a prediction. This is the `forward pass`:


In [80]:
prediction = model(data)
print(f"prediction is {prediction}")

prediction is tensor([[-5.9637e-01, -4.0663e-01, -6.1672e-01, -1.6282e+00, -8.0112e-01,
         -3.7188e-01, -6.1246e-01,  5.9014e-01,  4.3512e-01, -8.1981e-01,
         -9.9065e-01, -5.0543e-01, -2.0572e-01, -7.6218e-01, -1.0108e+00,
         -7.1315e-01, -7.6979e-01, -8.0279e-02, -3.8696e-01, -6.8331e-01,
         -1.3993e+00, -6.6493e-01, -1.3539e+00,  1.9825e-01, -7.8628e-01,
         -1.1046e+00, -6.1931e-01, -1.0785e+00, -7.9773e-01, -2.8110e-01,
         -8.3119e-01, -9.6465e-01, -6.4080e-01, -7.5072e-01, -4.1383e-01,
         -5.7549e-01,  5.1857e-01, -7.9591e-01, -4.1426e-01,  3.0670e-02,
         -7.4567e-01, -8.4535e-01, -1.0504e+00, -4.6258e-01, -5.7252e-01,
         -3.8199e-01, -7.2758e-01, -4.2273e-01, -1.2770e+00, -1.3260e+00,
         -5.4800e-01,  2.5394e-01, -1.7533e-01, -4.0595e-01, -4.6777e-02,
         -8.8195e-01, -1.3289e-01, -1.1557e+00, -1.9127e-01, -2.5418e-01,
          7.8824e-01,  4.8648e-02, -9.1774e-02,  3.9039e-01, -4.9240e-01,
          9.2045e-02, -1

Now we use the model's prediction and the corresponding labels to calculate the error (`loss`). The next step is to `backpropagate` this error through the network.


In [81]:
loss = (prediction - labels).sum()
print(f"loss is {loss}")

loss is -499.898681640625



Now backpropagate the error (`loss`) through the network by calling `.backward()` on the loss tensor.
Autograd then calculates the gradients for each model parameter and stores them in the parameter's `.grad` attribute.

In [82]:
loss.backward()
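
To see what `.backward()` leaves behind without downloading resnet18, the sketch below uses a single `nn.Linear` layer as a stand-in. This substitution and the names `layer`, `x` and `grad_before` are assumptions for illustration. For a loss of `layer(x).sum()`, the gradient of each weight row is `x` itself and the gradient of the bias is all ones.

```python
import torch
from torch import nn

torch.manual_seed(0)

layer = nn.Linear(4, 2)    # small stand-in for the pretrained model
x = torch.rand(1, 4)

grad_before = layer.weight.grad    # None: nothing has been backpropagated yet

loss = layer(x).sum()
loss.backward()

# after backward(), every parameter holds a gradient of its own shape
print(grad_before)
print(layer.weight.grad.shape)
print(layer.bias.grad)
```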

Next we load an optimizer, in this case SGD with a learning rate of 0.01 and momentum of 0.9. We register all the parameters of the model in this optimizer.




In [83]:
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

Finally, we call `.step()` to initiate gradient descent. The optimizer adjusts each parameter using its gradient stored in `.grad`.

In [84]:
optim.step()

At this point, you have everything you need to train your neural network.
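
Putting the pieces together, a training loop repeats the forward pass, loss, backward pass and optimizer step many times. The sketch below is a minimal version under stated assumptions: a tiny `nn.Linear` model instead of resnet18, `nn.MSELoss` instead of the summed difference used above, and a made-up toy target (the sum of the input features). The call to `zero_grad()` clears gradients from the previous iteration, since `.backward()` accumulates into `.grad`.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(3, 1)                       # toy model, not resnet18
optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
loss_fn = nn.MSELoss()                        # assumed loss for this sketch

inputs = torch.rand(16, 3)
targets = inputs.sum(dim=1, keepdim=True)     # toy target: sum of the features

losses = []
for step in range(200):
    optim.zero_grad()                         # reset accumulated gradients
    prediction = model(inputs)                # forward pass
    loss = loss_fn(prediction, targets)       # compute the error
    loss.backward()                           # backward pass fills .grad
    optim.step()                              # gradient descent update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```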