# PyTorch notebook: The nitty-gritty

I hope to one day make this into a lovely notebook that gets into the nitty-gritty of PyTorch.

In [2]:
import torch
import torch.nn as nn

### Layers

**Linear Layers**

* `nn.Linear(input_features, output_features)` performs the transformation $Y = X W^{T} + b$, where:
    - $Y$ is (batch_size, output_features); output_features > 1 when we are trying to predict more than one outcome.
    - $X$ is (batch_size, input_features)
    - $W^{T}$ is (input_features, output_features)
    - $b$ is (output_features,) and is broadcast across the batch
* Calling the layer on $X$ returns $Y$.
* The weight matrix stored on the layer (`.weight`) holds $W$, not $W^{T}$, so its shape is (output_features, input_features).

In [13]:
batch_size = 3
in_features = 5
out_features = 3

linear = nn.Linear(in_features, out_features)

X = torch.rand(batch_size, in_features)

Y = linear(X)

print("X...")
print(X)
print(X.shape)
print("")

print("Y...")
print(Y)
print(Y.shape)
print("")

print("Linear Weight...")
print(linear.weight)
print(linear.weight.shape)

X...
tensor([[0.4225, 0.7761, 0.2352, 0.2483, 0.6939],
        [0.1330, 0.7882, 0.9282, 0.3806, 0.3382],
        [0.8689, 0.3895, 0.1511, 0.5371, 0.0823]])
torch.Size([3, 5])

Y...
tensor([[-0.4052,  0.4220, -0.2172],
        [-0.1551,  0.7601, -0.3398],
        [-0.2747, -0.0779, -0.4220]], grad_fn=<AddmmBackward0>)
torch.Size([3, 3])

Linear Weight...
Parameter containing:
tensor([[-0.1004, -0.2112,  0.2376,  0.2016, -0.0908],
        [-0.4288,  0.4420,  0.3763, -0.2624,  0.0492],
        [-0.4021, -0.0935, -0.2856, -0.2747,  0.0100]], requires_grad=True)
torch.Size([3, 5])
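As a quick sanity check of the formula above, the sketch below recomputes $Y$ by hand from the layer's `weight` and `bias` and compares it with what the layer returns. The seed and the names `Y_manual` and `matches` are only illustrative choices, not part of the original cell.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # illustrative seed so the check is repeatable

batch_size, in_features, out_features = 3, 5, 3
linear = nn.Linear(in_features, out_features)
X = torch.rand(batch_size, in_features)

# Output produced by the layer itself
Y = linear(X)

# The same computation by hand. The weight is stored as
# (out_features, in_features), so it is transposed before the product.
Y_manual = X @ linear.weight.T + linear.bias

matches = torch.allclose(Y, Y_manual, atol=1e-6)
print(f"Manual result matches layer output: {matches}")
print(f"Bias shape: {linear.bias.shape}")
```

If the two results agree, the transpose is confirmed: the stored weight is $W$, and the layer applies $W^{T}$.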


**RNN layer**


In [25]:
input_size = 3
seq_length = 5
batch_size = 1

rnn = nn.RNN(input_size=input_size, hidden_size=3)

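# By default (batch_first=False) nn.RNN expects input shaped
# (seq_length, batch_size, input_size)
X = torch.rand(seq_length, batch_size, input_size)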
print(f"X shape: {X.shape}")

out, hh = rnn(X)
print(f"Output shape: {out.shape}")
print(f"Hidden shape: {hh.shape}")
print("hh holds the hidden state ONLY for the final time step")
print(hh)
print("")
print("out holds the hidden state for all time steps")
print(out)


X shape: torch.Size([5, 1, 3])
Output shape: torch.Size([5, 1, 3])
Hidden shape: torch.Size([1, 1, 3])
hh holds the hidden state ONLY for the final time step
tensor([[[-0.1781,  0.9003,  0.6967]]], grad_fn=<StackBackward0>)

out holds the hidden state for all time steps
tensor([[[ 0.4643,  0.8609,  0.6027]],

        [[-0.0283,  0.9431,  0.6206]],

        [[-0.1864,  0.9239,  0.6808]],

        [[-0.0794,  0.9508,  0.7809]],

        [[-0.1781,  0.9003,  0.6967]]], grad_fn=<StackBackward0>)
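The last row of the per-step output above equals the final hidden state. The sketch below checks that relationship and then rebuilds the whole output by hand. It assumes the defaults of `nn.RNN`: a single layer, `tanh` nonlinearity, and a zero initial hidden state, so the recurrence is $h_t = \tanh(x_t W_{ih}^{T} + b_{ih} + h_{t-1} W_{hh}^{T} + b_{hh})$. The seed and the names `out_manual`, `last_step_matches`, and `manual_matches` are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # illustrative seed so the check is repeatable

input_size, hidden_size = 3, 3
seq_length, batch_size = 5, 1

rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size)
X = torch.rand(seq_length, batch_size, input_size)
out, hh = rnn(X)

# The final time step of `out` is the same tensor of values as `hh`
# (hh has an extra leading dimension for num_layers * num_directions).
last_step_matches = torch.allclose(out[-1], hh[0])

# Recompute every hidden state by hand, starting from a zero hidden state.
h = torch.zeros(batch_size, hidden_size)
steps = []
for x_t in X:  # iterates over the seq_length dimension
    h = torch.tanh(
        x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
        + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
    )
    steps.append(h)
out_manual = torch.stack(steps)  # (seq_length, batch_size, hidden_size)

manual_matches = torch.allclose(out, out_manual, atol=1e-6)
print(f"out[-1] equals hh[0]: {last_step_matches}")
print(f"Manual recurrence matches out: {manual_matches}")
```

With more than one layer or a bidirectional RNN, `hh` gains extra entries along its first dimension and the simple `out[-1] == hh[0]` check no longer covers every case.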
