
## PyTorch Notes

* `torch.rand()`: Returns a tensor filled with random numbers from a uniform distribution on the interval [0, 1).
* `@`: The matrix-multiplication operator. `A @ B` is equivalent to `torch.matmul(A, B)`.

In [11]:
import torch

X = torch.rand(size=(2, 2))
X

tensor([[0.5216, 0.1485],
        [0.3893, 0.0416]])
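A small sketch contrasting `torch.rand` with `torch.randn` (which the later cells use). The sample size and the seed are arbitrary choices made for illustration, not part of the original notes.

```python
import torch

torch.manual_seed(0)       # fixed seed only so the sketch is repeatable

U = torch.rand(10_000)     # uniform distribution on [0, 1)
N = torch.randn(10_000)    # standard normal distribution (mean 0, variance 1)

# Uniform samples never leave [0, 1); normal samples can be negative.
print(U.min().item() >= 0.0, U.max().item() < 1.0)
print((N < 0).any().item())
```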

In [12]:
# The @ operator (batched matrix multiplication)
A = torch.randn(1, 64, 1152, 1, 8)
B = torch.randn(10, 1, 1152, 8, 16)

# The matrix multiplication is applied to the last two dimensions (1x8 @ 8x16 --> 1x16).
# The leading three dimensions are batch dimensions and are broadcast against each other.
C = A @ B

A.shape, B.shape, C.shape

(torch.Size([1, 64, 1152, 1, 8]),
 torch.Size([10, 1, 1152, 8, 16]),
 torch.Size([10, 64, 1152, 1, 16]))
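To check the broadcasting claim more directly, here is a minimal sketch with smaller tensors (the sizes are picked only to keep it fast). It uses `torch.broadcast_shapes` to compute the expected batch shape and compares `@` against `torch.matmul`.

```python
import torch

torch.manual_seed(0)  # fixed seed only so the sketch is repeatable

A = torch.randn(1, 4, 3, 1, 8)
B = torch.randn(5, 1, 3, 8, 16)

# Everything except the last two dimensions is treated as batch
# and follows the usual broadcasting rules.
batch_shape = torch.broadcast_shapes(A.shape[:-2], B.shape[:-2])

C_op = A @ B               # operator form
C_fn = torch.matmul(A, B)  # function form

print(batch_shape)                   # expected batch shape
print(C_op.shape)                    # batch shape followed by (1, 16)
print(torch.allclose(C_op, C_fn))    # both forms should agree
```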

* `add_module()` method in `Module` class: Adds a child module to the current module under a given name. This is useful when registering modules in a `for` loop.
* `children()` method in `Module` class: Returns an iterator over the *immediate children* modules. It is a generator that yields the layers of the model, from which you can read the parameter tensors using `layer.weight` and `layer.bias` (for layers that have them).

In [13]:
# add_module
from torch import nn

X = torch.rand(2, 20)

modules1 = {'linear1': nn.LazyLinear(256),
           'actv1': nn.ReLU(),
           'linear2': nn.LazyLinear(10)}

class Net(nn.Module):
  def __init__(self, **kwargs):
    super().__init__()

    for key, value in kwargs.items():
      self.add_module(key, value)

  def forward(self, X):
    # children() yields only the immediate submodules, in registration order.
    # modules() would not work here: it also yields the net itself, so forward() would call itself recursively.
    for module in self.children():
      X = module(X)
    return X

net = Net(**modules1)

# modules() yields the net itself first, so printing the first item shows the whole network.
for module in net.modules():
  print(module)
  break


Net(
  (linear1): LazyLinear(in_features=0, out_features=256, bias=True)
  (actv1): ReLU()
  (linear2): LazyLinear(in_features=0, out_features=10, bias=True)
)
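The following sketch shows one way to read `weight` and `bias` from the layers that `children()` yields. `TinyNet` and its `sizes` argument are hypothetical names introduced here, and `nn.Linear` is used instead of `nn.LazyLinear` so that the shapes are known before the first forward pass.

```python
import torch
from torch import nn

class TinyNet(nn.Module):  # hypothetical class, for illustration only
  def __init__(self, sizes):
    super().__init__()
    # Register one Linear + ReLU pair per consecutive size pair.
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
      self.add_module(f'linear{i}', nn.Linear(n_in, n_out))
      self.add_module(f'actv{i}', nn.ReLU())

  def forward(self, X):
    for layer in self.children():
      X = layer(X)
    return X

net = TinyNet([20, 256, 10])
out = net(torch.rand(2, 20))

# Only layers with parameters expose .weight and .bias; ReLU has neither.
linear_shapes = [(tuple(layer.weight.shape), tuple(layer.bias.shape))
                 for layer in net.children() if isinstance(layer, nn.Linear)]
print(linear_shapes)
print(out.shape)
```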




* `modules()` method in `Module` class: Returns an iterator over *all modules* in the network, including the network itself and every nested submodule. To iterate over modules recursively, use `modules()` instead of `children()`.

In [14]:
# Difference between children() and modules()

net2 = nn.Sequential(nn.Linear(2,2),
                     nn.ReLU(),
                     nn.Sequential(nn.Sigmoid(),
                                   nn.ReLU()))

In [15]:
for module in net2.children():
  print(module)

Linear(in_features=2, out_features=2, bias=True)
ReLU()
Sequential(
  (0): Sigmoid()
  (1): ReLU()
)


In [16]:
for module in net2.modules():
  print(module)

Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): ReLU()
  (2): Sequential(
    (0): Sigmoid()
    (1): ReLU()
  )
)
Linear(in_features=2, out_features=2, bias=True)
ReLU()
Sequential(
  (0): Sigmoid()
  (1): ReLU()
)
Sigmoid()
ReLU()
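The difference can also be seen by counting. This is a small sketch that rebuilds the same network and, as an extra illustration not in the original notes, filters `modules()` down to the leaf layers (modules with no children of their own).

```python
from torch import nn

net2 = nn.Sequential(nn.Linear(2, 2),
                     nn.ReLU(),
                     nn.Sequential(nn.Sigmoid(),
                                   nn.ReLU()))

n_children = len(list(net2.children()))  # immediate children only
n_modules = len(list(net2.modules()))    # the net itself plus every nested module

# Leaf layers: modules that contain no submodules.
leaves = [type(m).__name__ for m in net2.modules()
          if len(list(m.children())) == 0]

print(n_children, n_modules)
print(leaves)
```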


* `net.state_dict()`: In PyTorch, the learnable parameters (weights and biases) of a `torch.nn.Module` model are accessed with `model.parameters()`. A `state_dict` is simply a Python dictionary (an `OrderedDict`) that maps each parameter name (and persistent buffer name), such as `0.weight`, to its tensor. Layers without parameters, such as `nn.ReLU`, have no entries.

In [17]:
net3 = nn.Sequential(nn.Linear(2,2),
                     nn.ReLU(),
                     nn.Sequential(nn.Sigmoid(),
                                   nn.ReLU()))

In [19]:
for param in net3.parameters():
  print(type(param.data), param.size())

<class 'torch.Tensor'> torch.Size([2, 2])
<class 'torch.Tensor'> torch.Size([2])


In [20]:
net3.state_dict()

OrderedDict([('0.weight', tensor([[-0.4838, -0.1259],
                      [-0.0970, -0.0682]])),
             ('0.bias', tensor([-0.2935, -0.2795]))])
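A common use of a `state_dict` is copying parameters from one model into another with the same architecture. The sketch below assumes a hypothetical helper `make_net()` that builds the network shown above twice; `load_state_dict()` then overwrites the second model's parameters.

```python
import torch
from torch import nn

def make_net():  # hypothetical helper, for illustration only
  return nn.Sequential(nn.Linear(2, 2),
                       nn.ReLU(),
                       nn.Sequential(nn.Sigmoid(),
                                     nn.ReLU()))

src, dst = make_net(), make_net()   # two independently initialized copies

keys = list(src.state_dict().keys())
result = dst.load_state_dict(src.state_dict())  # copy src's parameters into dst

# After loading, every tensor in dst matches the one in src.
same = all(torch.equal(src.state_dict()[k], dst.state_dict()[k]) for k in keys)

print(keys)
print(result.missing_keys, result.unexpected_keys)
print(same)
```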

* `named_parameters()` method in `nn.Module` class: Returns an iterator over module parameters, yielding both the name of each parameter and the parameter itself.

In [21]:
for name, param in net3.named_parameters():
  print(name, param)

0.weight Parameter containing:
tensor([[-0.4838, -0.1259],
        [-0.0970, -0.0682]], requires_grad=True)
0.bias Parameter containing:
tensor([-0.2935, -0.2795], requires_grad=True)
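Because the names are available, `named_parameters()` is handy for selecting parameters by name. As a hedged illustration (freezing biases is just an example task, not something the original notes do), the sketch below turns off gradients for every parameter whose name ends in `bias` and counts the total number of parameter elements.

```python
from torch import nn

net3 = nn.Sequential(nn.Linear(2, 2),
                     nn.ReLU(),
                     nn.Sequential(nn.Sigmoid(),
                                   nn.ReLU()))

# Freeze every bias by name; weights stay trainable.
for name, param in net3.named_parameters():
  if name.endswith('bias'):
    param.requires_grad_(False)

trainable = [name for name, param in net3.named_parameters() if param.requires_grad]
n_params = sum(param.numel() for param in net3.parameters())

print(trainable)
print(n_params)
```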


## General Notes

### \*args and \*\*kwargs

* \*args: **Non-Keyword Arguments**: Used in a function definition in Python to accept a variable number of positional arguments, which are collected into a *tuple*.
* \*\*kwargs: **Keyword Arguments**: Used in a function definition in Python to accept a *keyworded*, variable-length argument list.

* A keyword argument is one where you provide a name for the variable as you pass it into the function. `kwargs` can be thought of as a *dictionary*.

In [None]:
# *args
def myfunc(*args):
  for arg in args:
    print(arg)

args = ('Davood', 'Ahmad', 'Akbar', 'Mohsen')
myfunc(*args)

Davood
Ahmad
Akbar
Mohsen


In [None]:
# **kwargs

def myfunc1(**kwargs):
  for key, value in kwargs.items():         # items() yields each (key, value) pair of the dictionary.
    print(f'{key} == {value}')

myfunc1(name='Davood', age=29, education='M.Sc.')

name == Davood
age == 29
education == M.Sc.
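Both forms can be combined in one function, and `**` can also unpack an existing dictionary at the call site (this is how `Net(**modules1)` works above). A minimal sketch, where `describe` and `person` are hypothetical names used only for illustration:

```python
def describe(*args, **kwargs):
  # args arrives as a tuple, kwargs as a dictionary
  return args, kwargs

positional, named = describe('Davood', 'Ahmad', age=29)
print(type(positional).__name__, positional)
print(type(named).__name__, named)

# ** unpacks a dictionary into keyword arguments at the call site.
person = {'name': 'Davood', 'education': 'M.Sc.'}
_, unpacked = describe(**person)
print(unpacked == person)
```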
