## 1. Tensors (Foundation)
Everything in PyTorch is a tensor. A tensor is like a NumPy array, but it can track gradients and it can run on a GPU.

`Key APIs: torch.tensor, torch.cuda.is_available(), torch.rand, .to('cuda'), .shape`

In [2]:
import torch

# 1. Create a tensor (Matrix)
x = torch.tensor([[1., 2., 3.], [4., 5., 6.]])

# 2. Operation (element-wise)
y = x + 5

# 3. GPU support (it is crucial for Deep Learning)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = x.to(device)

print(f"Shape: {x.shape}")
print(f"Device: {device}")
print(f"Type: {type(x)}")
print(f"Value: {x}")

Shape: torch.Size([2, 3])
Device: cpu
Type: <class 'torch.Tensor'>
Value: tensor([[1., 2., 3.],
        [4., 5., 6.]])
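
As a small illustration of the element-wise operation used above, the sketch below contrasts it with a matrix product on the same tensor. The names `elementwise` and `matmul` are illustrative and do not appear in the original notebook.

```python
import torch

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]])

# Element-wise: the scalar 5 is broadcast to every entry
y = x + 5

# Element-wise product of two tensors with the same shape
elementwise = x * x

# Matrix product: (2, 3) @ (3, 2) -> (2, 2)
matmul = x @ x.T

print(y)
print(elementwise)
print(matmul)
```

The `*` operator multiplies matching entries, while `@` performs a true matrix multiplication, which is why the second operand has to be transposed here.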


## 2. Autograd (The Magic)
This is what makes training possible. PyTorch records every operation you perform on a tensor so that it can calculate gradients (derivatives) automatically later.

`Key APIs: requires_grad=True, .backward(), .grad`

In [8]:
import torch

# Create a tensor and tell PyTorch: "Watch this variable for math"
x = torch.tensor([2.], requires_grad=True)
z = torch.tensor([3.], requires_grad=True)

# Define a formula
y = x ** 3 + z

# Calculate gradients (dy/dx = 3 * x^2, dy/dz = 1)
y.backward()

# Check gradients at x = 2, z = 3
# Expect dy/dx = 3 * 2**2 = 12 and dy/dz = 1
print(x.grad)
print(z.grad)

tensor([12.])
tensor([1.])
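
One detail worth seeing in isolation: `.backward()` adds to `.grad` rather than overwriting it, which is why the training recipe in section 5 includes a "zero grad" step. The sketch below is a minimal demonstration; the names `first` and `second` are illustrative.

```python
import torch

x = torch.tensor([2.], requires_grad=True)

# First backward pass: dy/dx = 3 * 2**2 = 12
y = x ** 3
y.backward()
first = x.grad.clone()

# Second backward pass on a fresh graph: the new gradient is ADDED to x.grad
y = x ** 3
y.backward()
second = x.grad.clone()

# Reset the accumulated gradient in place
x.grad.zero_()

print(first, second, x.grad)
```

After two backward passes the stored gradient is twice the true derivative, so forgetting to reset it would silently distort every weight update.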


## 3. Neural Network Architecture
We define a neural network by subclassing `nn.Module`. We act as an architect: define the layers in `__init__` and connect them in `forward`.

`Key APIs: nn.Linear, nn.ReLU, nn.Sequential`

In [16]:
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()

        # Define layers: Input 10 -> Hidden 5 -> Output 1
        self.layer1 = nn.Linear(10, 5)
        self.activation = nn.ReLU()  # Activation function
        self.layer2 = nn.Linear(5, 1)

    def forward(self, x):
        x = self.layer1(x)
        x = self.activation(x)
        x = self.layer2(x)
        return x

# Initialize model
model = SimpleNet()
print(model)

# Test model
fake_input = torch.randn(5, 10)
print(f"Input: {fake_input}")
output = model(fake_input)
print(f"Output: {output}")

SimpleNet(
  (layer1): Linear(in_features=10, out_features=5, bias=True)
  (activation): ReLU()
  (layer2): Linear(in_features=5, out_features=1, bias=True)
)
Input: tensor([[ 1.2026e+00,  5.2272e-02,  4.1974e-01,  1.8325e-01, -1.1257e+00,
          1.3734e+00, -1.3053e+00,  6.9239e-01, -5.1955e-02,  5.1084e-01],
        [-8.9762e-01,  5.2094e-01, -6.3102e-01,  8.5265e-03, -4.6980e-01,
         -9.8913e-01,  2.6267e-01, -9.3333e-01, -1.0063e+00, -2.4868e-01],
        [ 3.0129e-01, -3.2402e-01, -1.0102e-01, -2.3988e-01,  9.2713e-02,
          4.3243e-01,  7.7260e-01,  2.0692e+00,  1.0886e+00,  6.9122e-01],
        [-2.0511e-01, -1.9477e+00, -8.4685e-01, -1.0067e+00,  1.3272e-01,
         -5.0815e-01, -3.5361e-02,  9.4539e-01, -9.2513e-01, -1.4834e+00],
        [-1.4140e-04, -7.2409e-02,  7.1898e-01, -8.8838e-01, -3.5795e-01,
         -1.0386e-02,  2.5021e+00,  3.4118e-02,  3.8721e-01, -4.8204e-01]])
Output: tensor([[-0.0701],
        [-0.3270],
        [ 0.0333],
        [-0.3151],
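
`nn.Sequential` is listed under Key APIs but is not used in the cell above. As a minimal sketch, the same 10 -> 5 -> 1 architecture could also be written without a custom class; the names `seq_model`, `batch`, and `n_params` are illustrative.

```python
import torch
import torch.nn as nn

# Layers are applied in the order they are listed
seq_model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 1),
)

batch = torch.randn(5, 10)   # 5 samples, 10 features each
out = seq_model(batch)       # shape: (5, 1)

# Count trainable parameters: (10*5 + 5) + (5*1 + 1)
n_params = sum(p.numel() for p in seq_model.parameters())

print(out.shape, n_params)
```

Subclassing `nn.Module` is still the more flexible choice when `forward` needs branching or several inputs; `nn.Sequential` is convenient for a plain stack of layers.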
    

## 4. Data Loader
In deep learning we usually cannot feed all the data to the model at once. We need to batch and shuffle it. PyTorch handles this with `Dataset` and `DataLoader`.

`Key APIs: TensorDataset, DataLoader`

In [17]:
import torch
from torch.utils.data import DataLoader, TensorDataset

# Create a random input dataset with 100 rows and 10 features
inputs = torch.rand(100, 10)

# This is a prediction problem, so we also create a target for each row
targets = torch.rand(100, 1)

# Wrap the two tensors into a dataset
dataset = TensorDataset(inputs, targets)

# Create a dataloader
loader = DataLoader(dataset, batch_size=10, shuffle=True)

# Iterate over the batches (i = inputs, t = targets)
for i, t in loader:
    print(i, t)

tensor([[0.2484, 0.2784, 0.6148, 0.9052, 0.4380, 0.3373, 0.8568, 0.5252, 0.3750,
         0.5148],
        [0.0358, 0.1757, 0.3682, 0.8978, 0.9413, 0.1671, 0.6168, 0.3301, 0.3467,
         0.5585],
        [0.9637, 0.2139, 0.2334, 0.9150, 0.6472, 0.7433, 0.2992, 0.9609, 0.8104,
         0.7002],
        [0.5064, 0.9473, 0.6233, 0.3082, 0.1652, 0.3140, 0.8956, 0.1440, 0.1817,
         0.2501],
        [0.9143, 0.3909, 0.1028, 0.3601, 0.2251, 0.2723, 0.9635, 0.6837, 0.9163,
         0.1129],
        [0.5411, 0.0427, 0.9001, 0.2913, 0.7177, 0.1227, 0.1211, 0.2383, 0.8062,
         0.4001],
        [0.8808, 0.4073, 0.8589, 0.6497, 0.8883, 0.5018, 0.8811, 0.0127, 0.3697,
         0.9813],
        [0.5136, 0.5114, 0.4034, 0.2982, 0.9921, 0.1448, 0.7652, 0.7691, 0.3880,
         0.9613],
        [0.4811, 0.6001, 0.1528, 0.6329, 0.9722, 0.7418, 0.0129, 0.1398, 0.6255,
         0.6982],
        [0.0382, 0.0333, 0.4953, 0.6396, 0.6804, 0.1427, 0.8496, 0.8897, 0.8097,
         0.4361]]) tensor([[
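
Printing whole batches is hard to read, so a quick sanity check on shapes can be more useful. The sketch below rebuilds the same kind of loader and records the shape of each batch; `num_batches` and `shapes` are illustrative names.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

inputs = torch.rand(100, 10)
targets = torch.rand(100, 1)
loader = DataLoader(TensorDataset(inputs, targets), batch_size=10, shuffle=True)

# 100 rows / batch_size 10 -> 10 batches per pass over the data
num_batches = len(loader)

# Each iteration yields one (inputs, targets) pair of batched tensors
shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]

print(num_batches)
print(shapes[0])
```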

## 5. The Training Loop (Putting it all together)
This is the standard recipe used in most PyTorch projects. We will teach a network to learn the formula y = 2x (linear regression).

The Recipe:
1. Forward pass: compute the prediction
2. Loss calculation: measure how wrong the prediction is
3. Zero grad: reset the gradients left over from the previous step
4. Backward pass: recompute the gradients for all parameters
5. Step: update the weights
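
To make the last step concrete, the sketch below runs the recipe exactly once on a tiny hand-made dataset and checks what `optimizer.step()` did to the weight. It assumes plain SGD with no momentum; the toy data, the seed, and the learning rate `lr = 0.01` are illustrative choices, not values from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
lr = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

x = torch.tensor([[1.], [2.], [3.]])
y = 2. * x

prediction = model(x)             # 1. forward pass
loss = criterion(prediction, y)   # 2. loss calculation
optimizer.zero_grad()             # 3. zero grad
loss.backward()                   # 4. backward pass

w_before = model.weight.detach().clone()
grad = model.weight.grad.clone()

optimizer.step()                  # 5. step
w_after = model.weight.detach().clone()

# For plain SGD the update is: new_weight = old_weight - lr * gradient
print(w_before, grad, w_after)
```

Seen this way, the loop is simply many repetitions of "move each parameter a small distance against its gradient".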

In [54]:
import torch
import torch.nn as nn

# The function we want to calculate
# y = 2 * x

class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.layer1 = nn.Linear(1, 1)

    def forward(self, x):
        x = self.layer1(x)
        return x

# Use mean squared error (MSE) loss: it penalizes large errors heavily, so it is sensitive to outliers in the data
criterion = nn.MSELoss()
model = SimpleNet()

optimizer = torch.optim.SGD(model.parameters(), lr=0.00001)

inputs = torch.randint(low=0, high=100, size=(100, 1)) * 1.0
targets = 2. * inputs

num_epochs = 10000
for epoch in range(num_epochs):
    # Do prediction
    prediction = model(inputs.float())

    # Calculate error
    loss = criterion(prediction, targets)

    # Reset gradient
    optimizer.zero_grad()

    # Calculate gradient
    loss.backward()

    # Update network
    optimizer.step()

    print(f"Loss: {loss}")


unseen = torch.tensor([2.])
prediction = model(unseen)
prediction
# assert prediction == unseen * 2

Loss: 15627.7978515625
Loss: 13403.93359375
Loss: 11496.5322265625
Loss: 9860.5615234375
Loss: 8457.3935546875
Loss: 7253.89990234375
Loss: 6221.66748046875
Loss: 5336.3251953125
Loss: 4576.96875
Loss: 3925.671875
Loss: 3367.05615234375
Loss: 2887.933349609375
Loss: 2476.99169921875
Loss: 2124.528076171875
Loss: 1822.2210693359375
Loss: 1562.9334716796875
Loss: 1340.54296875
Loss: 1149.79931640625
Loss: 986.19921875
Loss: 845.8796997070312
Loss: 725.5283813476562
Loss: 622.3034057617188
Loss: 533.767578125
Loss: 457.8306884765625
Loss: 392.6999206542969
Loss: 336.8372802734375
Loss: 288.924072265625
Loss: 247.82908630371094
Loss: 212.58218383789062
Loss: 182.3509521484375
Loss: 156.42153930664062
Loss: 134.18212890625
Loss: 115.10757446289062
Loss: 98.74724578857422
Loss: 84.71519470214844
Loss: 72.6797866821289
Loss: 62.357078552246094
Loss: 53.50334167480469
Loss: 45.90962219238281
Loss: 39.396339416503906
Loss: 33.80997848510742
Loss: 29.018512725830078
Loss: 24.908952713012695
Loss

tensor([4.8133], grad_fn=<ViewBackward0>)
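
The prediction of about 4.81 for an expected 4 is consistent with the bias term converging very slowly when the inputs span 0 to 99 and the learning rate is as small as 1e-5. The following is a hedged sketch of one possible variation, not the notebook's method: it scales the inputs to [0, 1), uses a larger learning rate, and feeds mini-batches through the `DataLoader` from section 4. The seed, `lr=0.1`, the batch size, and the 200 epochs are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

# Assumption: inputs scaled to [0, 1) instead of integers in [0, 100)
inputs = torch.rand(100, 1)
targets = 2. * inputs

loader = DataLoader(TensorDataset(inputs, targets), batch_size=10, shuffle=True)

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # assumption: larger lr

for epoch in range(200):
    for xb, yb in loader:
        loss = criterion(model(xb), yb)  # forward pass + loss
        optimizer.zero_grad()            # reset old gradients
        loss.backward()                  # compute new gradients
        optimizer.step()                 # update weight and bias

# Evaluate without recording operations for autograd
with torch.no_grad():
    prediction = model(torch.tensor([[0.5]]))

print(model.weight.item(), model.bias.item(), prediction.item())
```

With inputs on a small scale, both the weight and the bias can move at a similar pace, so the learned weight should end up close to 2 and the bias close to 0. Wrapping the evaluation in `torch.no_grad()` also avoids the `grad_fn` attached to the prediction shown above.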

In [52]:
targets

tensor([[172.],
        [ 84.],
        [178.],
        [ 84.],
        [126.],
        [ 18.],
        [ 56.],
        [112.],
        [108.],
        [ 86.],
        [168.],
        [  6.],
        [ 32.],
        [116.],
        [104.],
        [ 86.],
        [144.],
        [ 82.],
        [114.],
        [140.],
        [132.],
        [130.],
        [156.],
        [188.],
        [ 34.],
        [188.],
        [ 18.],
        [170.],
        [170.],
        [ 64.],
        [ 94.],
        [ 26.],
        [106.],
        [162.],
        [ 72.],
        [ 28.],
        [ 34.],
        [ 76.],
        [  8.],
        [130.],
        [ 10.],
        [  8.],
        [ 10.],
        [136.],
        [124.],
        [170.],
        [156.],
        [ 98.],
        [128.],
        [ 48.],
        [ 54.],
        [  6.],
        [188.],
        [142.],
        [184.],
        [126.],
        [124.],
        [ 28.],
        [166.],
        [  4.],
        [ 80.],
        [158.],
        

In [44]:
inputs

tensor([[ 0.3767],
        [ 1.9490],
        [ 1.2904],
        [-0.0161],
        [ 0.0742],
        [-0.4900],
        [ 1.1154],
        [ 0.5543],
        [-0.3042],
        [ 0.1018],
        [-0.4432],
        [-0.1793],
        [-0.9141],
        [ 1.0854],
        [-0.9336],
        [-1.5184],
        [ 0.7422],
        [ 1.9173],
        [ 0.1550],
        [ 0.4583],
        [ 0.6486],
        [-1.1683],
        [-0.4724],
        [ 0.0696],
        [-0.1243],
        [-0.1694],
        [-0.5938],
        [-0.2613],
        [-0.6700],
        [ 0.4131],
        [ 0.7714],
        [ 0.3078],
        [-1.1005],
        [-0.3186],
        [-2.0531],
        [ 0.0077],
        [-2.0871],
        [-0.0179],
        [ 1.1261],
        [ 1.2403],
        [ 0.9590],
        [-1.0226],
        [ 1.4456],
        [ 0.5834],
        [ 0.9053],
        [-0.6100],
        [ 0.0598],
        [ 0.3530],
        [-0.2915],
        [-0.5164],
        [ 0.7086],
        [ 0.5621],
        [-0.