### When to use Machine Learning vs. Deep Learning

**Machine Learning:**
- Tabular data
- Requires feature engineering and selection
- Works with small datasets, e.g., 20-500 samples
- Trains well on a CPU
- More explainable

**Deep Learning:**
- Images, audio, signals, text
- Little or no manual feature engineering
- Needs large datasets, e.g., 10,000-100,000,000 samples
- Typically trained on a GPU
- Less explainable


### FeedForward Network

In [1]:
# Practicing code (Code written by Dr. Chaklam Shilpasuwanchai)
import torch
import sys
import numpy as np

In [2]:
# Version check
torch.__version__

'2.4.1+cu118'

In [3]:
torch.manual_seed(42)

<torch._C.Generator at 0x2c4ffd10790>

In [4]:
# Checking gpu
device = torch.device("cuda:0" if(torch.cuda.is_available()) else "cpu")
print("Device:", device)

Device: cuda:0


Steps:
1. Specify the input and target
2. Create a Dataset and DataLoader
3. Define the model (nn.Linear)
4. Define the loss function
5. Define the optimizer
6. Train the model

In [23]:
# 1. Specifying input and target

# Input
x_train = np.array([[72, 67, 43],[91,88,64], [87,134,58],
                    [102, 43, 37], [69,96,70],[72, 67, 43],[91,88,64], [87,134,58],
                    [102, 43, 37], [69,96,70],[72, 67, 43],[91,88,64], [87,134,58],
                    [102, 43, 37], [69,96,70]
                    ], dtype='float32')

# Target
y_train = np.array([[56,70], [81,101], [119,133], [22,37], [103,119],
                    [56,70], [81,101], [119,133], [22,37], [103,119],
                    [56,70], [81,101], [119,133], [22,37], [103,119]
                    ],  dtype='float32')

**torch.from_numpy** is a PyTorch function that converts a NumPy array into a PyTorch tensor. The tensor shares memory with the array, so no data is copied.

In [24]:
inputs = torch.from_numpy(x_train)
targets = torch.from_numpy(y_train)

print(inputs.size())
print(targets.shape)

torch.Size([15, 3])
torch.Size([15, 2])
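Because the tensor returned by torch.from_numpy shares memory with the source array, later in-place changes to the array show up in the tensor. The sketch below illustrates this with a small made-up array (the names `arr`, `shared`, and `copied` are illustrative, not part of the notebook) and contrasts it with torch.tensor, which copies the data.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype="float32")

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # independent copy of the data

arr[0, 0] = 100.0               # modify the NumPy array in place

print(shared[0, 0].item())      # follows the change made to arr
print(copied[0, 0].item())      # still holds the original value
```

If the NumPy array will be modified later, copying with torch.tensor may be the safer choice.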


**DataLoader** is a utility class used to efficiently load, shuffle, and iterate over data in batches.

**TensorDataset** is a simple dataset wrapper provided by the torch.utils.data module. It combines multiple tensors (typically input features and targets) into a single dataset, which can then be used with a **DataLoader** for batching, shuffling, and parallel loading.

In [25]:
# 2. Dataset and DataLoader
from torch.utils.data import TensorDataset


In [26]:
# Define dataset
train_ds = TensorDataset(inputs, targets)
train_ds[0:3]

(tensor([[ 72.,  67.,  43.],
         [ 91.,  88.,  64.],
         [ 87., 134.,  58.]]),
 tensor([[ 56.,  70.],
         [ 81., 101.],
         [119., 133.]]))

In [27]:
# DataLoader
from torch.utils.data import DataLoader

In [28]:
# Define data loader
batch_size = 3
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

In [29]:
for xb, yb in train_dl:
    print(xb)
    print(yb)
    break

tensor([[102.,  43.,  37.],
        [ 87., 134.,  58.],
        [ 69.,  96.,  70.]])
tensor([[ 22.,  37.],
        [119., 133.],
        [103., 119.]])
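When the dataset size is not a multiple of the batch size, the last batch is smaller. The sketch below uses a made-up dataset of 5 samples (not the notebook's data) to show how the number of batches is determined and how the `drop_last` option discards the incomplete batch.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Toy data: 5 samples with 2 features each, and 1 target per sample
features = torch.arange(10, dtype=torch.float32).reshape(5, 2)
labels = torch.arange(5, dtype=torch.float32).reshape(5, 1)
ds = TensorDataset(features, labels)

# 5 samples with batch_size=2 gives batches of size 2, 2, and 1
dl = DataLoader(ds, batch_size=2, shuffle=False)
batch_sizes = [xb.shape[0] for xb, yb in dl]
print(len(ds), len(dl), batch_sizes)

# drop_last=True discards the final incomplete batch
dl_drop = DataLoader(ds, batch_size=2, shuffle=False, drop_last=True)
print(len(dl_drop))
```

In the notebook, 15 samples with a batch size of 3 divide evenly, so every batch has the same size.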


In [30]:
# Define the model - nn.Linear

import torch.nn as nn

# Define model
model = nn.Linear(3,2)
print(model.weight)
print(model.weight.size())
print(model.bias)
print(model.bias.size())

Parameter containing:
tensor([[-0.4557, -0.2662, -0.1630],
        [-0.3471,  0.0545, -0.5702]], requires_grad=True)
torch.Size([2, 3])
Parameter containing:
tensor([ 0.5214, -0.4904], requires_grad=True)
torch.Size([2])
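The weight has shape [2, 3] (outputs by inputs) because nn.Linear computes `x @ W.T + b`. A minimal check of this, using a separate layer named `layer` and two sample rows so the notebook's `model` is left untouched:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 2)  # 3 input features, 2 outputs

x = torch.tensor([[72.0, 67.0, 43.0],
                  [91.0, 88.0, 64.0]])

out_layer = layer(x)                         # forward pass through the layer
out_manual = x @ layer.weight.T + layer.bias  # same computation written by hand

print(out_layer.shape)
print(torch.allclose(out_layer, out_manual, atol=1e-4))
```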


In [31]:
# Params
list(model.parameters())

[Parameter containing:
 tensor([[-0.4557, -0.2662, -0.1630],
         [-0.3471,  0.0545, -0.5702]], requires_grad=True),
 Parameter containing:
 tensor([ 0.5214, -0.4904], requires_grad=True)]

**model.parameters():** Returns an iterator over all the parameters (weights and biases) of the model. Each layer in a neural network has associated parameters, such as weights and biases.

**p.requires_grad:** This condition checks whether the parameter p requires a gradient. If it does, the parameter is trainable and will be updated during training. Parameters of frozen layers have requires_grad = False, which excludes them from training.

**p.numel():** Returns the total number of elements in the parameter tensor p.

In [32]:
# complexity by the number of parameters
print(sum(p.numel() for p in model.parameters() if p.requires_grad))

8
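The count of 8 comes from 3 x 2 = 6 weights plus 2 biases. The sketch below shows how freezing a parameter changes the count; the helper `count_trainable` is a hypothetical name introduced here for illustration, and a fresh layer is used so the notebook's `model` is not altered.

```python
import torch.nn as nn

layer = nn.Linear(3, 2)

def count_trainable(m):
    """Count the elements of all parameters that require gradients."""
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

total_before = count_trainable(layer)  # 6 weights + 2 biases
layer.bias.requires_grad = False       # freeze the bias
total_after = count_trainable(layer)   # only the 6 weights remain

print(total_before, total_after)
```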


In [33]:
# Generate predictions
pred = model(inputs)
pred

tensor([[-57.1354, -46.3541],
        [-74.8075, -63.7805],
        [-84.2500, -56.4638],
        [-63.4415, -54.6548],
        [-67.8887, -59.1288],
        [-57.1354, -46.3541],
        [-74.8075, -63.7805],
        [-84.2500, -56.4638],
        [-63.4415, -54.6548],
        [-67.8887, -59.1288],
        [-57.1354, -46.3541],
        [-74.8075, -63.7805],
        [-84.2500, -56.4638],
        [-63.4415, -54.6548],
        [-67.8887, -59.1288]], grad_fn=<AddmmBackward0>)

In [34]:
# Define Loss Function
criterion_mse = nn.MSELoss()
criterion_softmax_cross_entropy_loss = nn.CrossEntropyLoss()

In [35]:
mse = criterion_mse(pred, targets)
print(mse)
print(mse.item())

tensor(23160.7246, grad_fn=<MseLossBackward0>)
23160.724609375
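With its default settings, nn.MSELoss averages the squared differences over every element. A small hand-checkable sketch with made-up values (not the notebook's data):

```python
import torch
import torch.nn as nn

pred_demo = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
target_demo = torch.tensor([[0.0, 2.0], [3.0, 6.0]])

loss_builtin = nn.MSELoss()(pred_demo, target_demo)
# Squared differences are 1, 0, 0, 4, so the mean is 5 / 4
loss_manual = ((pred_demo - target_demo) ** 2).mean()

print(loss_builtin.item(), loss_manual.item())
```

The large initial loss above is expected, since the untrained model's predictions are far from the targets.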


In [36]:
# Define the optimizer
opt = torch.optim.SGD(model.parameters(), lr=0.0001, momentum=0.9)
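To illustrate what `momentum` does, the sketch below applies two SGD steps to a single made-up tensor `w` with a fixed gradient `g` (both are illustrative; the learning rate of 0.1 is chosen only to make the numbers easy to follow). With PyTorch's default settings, the momentum buffer starts equal to the first gradient and is then updated as `buf = momentum * buf + grad`, with the parameter moved by `lr * buf`.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
demo_opt = torch.optim.SGD([w], lr=0.1, momentum=0.9)
g = torch.tensor([0.5, -1.0])  # pretend gradient, identical at every step

# Step 1: buf = g, so w becomes w - 0.1 * g
w.grad = g.clone()
demo_opt.step()
after_first = w.detach().clone()

# Step 2: buf = 0.9 * g + g = 1.9 * g, so w moves by 0.1 * 1.9 * g
w.grad = g.clone()
demo_opt.step()
after_second = w.detach().clone()

print(after_first, after_second)
```

The second step is larger than the first even though the gradient is unchanged, which is the accelerating effect of momentum.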

In [37]:
# Training

def fit(num_epochs, model, loss_fn, opt, train_dl):
    for epoch in range(num_epochs):

        # Train with batches of data
        for xb, yb in train_dl:
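            # Tensor.to() is not in-place, so assign the result.
            # Use the model's own device so inputs and weights always match.
            model_device = next(model.parameters()).device
            xb = xb.to(model_device)
            yb = yb.to(model_device)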

            # Predict
            pred = model(xb)

            # Calculate loss
            loss = loss_fn(pred, yb)

            # Reset old gradients, then backpropagate
            opt.zero_grad()
            loss.backward()

            # Update params
            opt.step()
        
        # Report the loss of the last batch every 10 epochs
        if (epoch + 1) % 10 == 0:
            sys.stdout.write("\rEpoch[{}/{}], Loss: {:.4f}".format(epoch+1, num_epochs, loss.item()))

In [38]:
fit(100, model, criterion_mse, opt, train_dl)

Epoch[100/100], Loss: 19.3841
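A note on device placement: `Tensor.to()` returns a tensor rather than modifying the original, whereas `nn.Module.to()` moves the module's parameters in place. A minimal sketch of training-style placement, assuming a fresh layer `net` and a random batch (both illustrative):

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

net = nn.Linear(3, 2).to(device)  # moves the parameters to the device

batch = torch.randn(4, 3)
batch = batch.to(device)          # keep the returned tensor

out = net(batch)                  # model and data are on the same device
print(out.device)
```

If the model and the batch are on different devices, the forward pass raises a runtime error.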

In [39]:
pred = model(inputs)
loss = criterion_mse(pred, targets)
print(loss.item())

21.766687393188477
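When predictions are only needed for evaluation, a common pattern is to wrap the forward pass in `torch.no_grad()` so that no computation graph is built. A minimal sketch with a fresh layer and random inputs (the names `eval_model` and `eval_inputs` are illustrative):

```python
import torch
import torch.nn as nn

eval_model = nn.Linear(3, 2)
eval_inputs = torch.randn(5, 3)

pred_tracked = eval_model(eval_inputs)   # tracked by autograd
with torch.no_grad():
    pred_eval = eval_model(eval_inputs)  # same values, no graph recorded

print(pred_tracked.requires_grad, pred_eval.requires_grad)
```

The predicted values are the same in both cases; only the gradient bookkeeping differs.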
