##### Article list
* machine learning / ai
* how machine learning works
* machine learning algorithms
    * Supervised -> linear / logistic regression, svm, Decision trees, Random forests, k-NN, Neural networks, Naive bayes - explain with 1/2 problems
    * Unsupervised -> K-Means Clustering, Hierarchical Clustering, Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Gaussian Mixture Models (GMM), Autoencoders, Self-Organizing Maps (SOM), DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
    * Semi-Supervised -> Self-training, Co-training, Generative Models (Generative Adversarial Networks)
    * Reinforcement -> Q-Learning, Deep Q-Networks (DQN), Policy Gradient Methods, Actor-Critic Methods, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO)
    * Ensemble -> Bagging (Bootstrap Aggregating), Boosting (e.g., AdaBoost, Gradient Boosting), Stacking, Voting Classifiers
    * Deep Learning -> CNN, RNN, LSTM, Gated Recurrent Units (GRU), Transformer Models, Capsule Networks, Variational Autoencoders (VAE), Generative Adversarial Networks (GAN)
    * Anomaly Detection -> Isolation Forest, One-Class SVM, Autoencoders, Local Outlier Factor (LOF)
    * Dimensionality Reduction -> Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Autoencoders
    * Feature Selection -> Recursive Feature Elimination (RFE), Feature Importance (e.g., from tree-based models), Lasso Regression
* AI algorithms

* research topic
    * how AI can improve the environment

### Vision input output

raw image -> numerical encoding -> model -> output -> predicted output <br>

* input shape
```[batch_size, height, width, color_channels]``` ex: [None, 224, 224, 3], [32, 224, 224, 3]

batch_size is a hyperparameter; the largest practical value depends on the problem and on the hardware (memory) available

* output shape: number of classes [n, ....]

Different input shapes (for different frameworks):
1. shape -> [None, 28, 28, 1] (NHWC)
2. shape -> [None, 1, 28, 28] (NCHW)
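
As a minimal sketch of moving between the two layouts above, a channels-last (NHWC) batch can be rearranged into the channels-first (NCHW) layout with `permute`. The tensor here is random stand-in data, not part of the notebook's dataset.

```python
import torch

# A hypothetical batch of 32 grayscale 28x28 images in channels-last (NHWC) layout.
nhwc_batch = torch.rand(32, 28, 28, 1)

# permute reorders the dimensions: (N, H, W, C) -> (N, C, H, W).
nchw_batch = nhwc_batch.permute(0, 3, 1, 2)

print(nhwc_batch.shape)  # torch.Size([32, 28, 28, 1])
print(nchw_batch.shape)  # torch.Size([32, 1, 28, 28])
```

PyTorch layers such as `nn.Conv2d` expect the NCHW layout, which is why `ToTensor()` returns images with the channel dimension first.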

##### Path to build
* Get the data ready (convert to tensors)
* Build or pick a model - Pick a loss function, Build a training loop
* Fit the model to the data and make prediction
* Evaluate the model
* Improve through experimentation
* Save and reload the trained model

#### CNN
- Convolutional layer - requires input data and a filter (feature detector/kernel), and produces a feature map
`nb`: the feature detector, also known as a kernel or a filter, moves across the receptive fields of the image, checking whether the feature is present. This process is known as a convolution.
- Pooling layer
- Fully-connected layer
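
To see how these layers change the spatial size of an image, the standard output-size formula `floor((size + 2*padding - kernel_size) / stride) + 1` can be sketched in plain Python. The helper name `conv_output_size` is hypothetical and only for illustration.

```python
def conv_output_size(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

# A 28x28 input through a 3x3 convolution with padding 1 keeps its size...
after_conv = conv_output_size(28, kernel_size=3, stride=1, padding=1)
# ...and a 2x2 max pool with stride 2 halves it.
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)

print(after_conv, after_pool)  # 28 14
```

The same formula applies to height and width independently, so it can be used to predict the input size of the fully-connected layer at the end of the network.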

#### Libs
* `torchvision.datasets` - datasets and data loading functions for computer vision
* `torchvision.models` - PyTorch pre-trained models
* `torchvision.transforms` - functions for manipulating vision data to be suitable for use with a model
* `torch.utils.data.Dataset` - base dataset class for PyTorch
* `torch.utils.data.DataLoader` - creates a Python iterable over a dataset

In [None]:
import torch
from torch import nn

import torchvision
from torchvision import datasets
from torchvision import transforms
from torchvision.transforms import ToTensor
%matplotlib inline
import matplotlib.pyplot as plt

import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

1. Getting dataset
`FashionMNIST`

In [None]:
train_data = datasets.FashionMNIST(
    root='data',
    train=True,
    download=True,
    transform=torchvision.transforms.ToTensor(),
    target_transform=None
)

test_data = datasets.FashionMNIST(
    root='data',
    train=False,
    download=True,
    transform=ToTensor(),
    target_transform=None
)

In [None]:
len(train_data), len(test_data)

In [None]:
image, label = train_data[0]

type(image)

In [None]:
class_names = train_data.classes
class_to_idx = train_data.class_to_idx
class_to_idx

In [None]:
from PIL import Image

image, label = train_data[0]
print(f'Image size: {image.shape}')

# Image.open(image.permute(1, 2, 0)).show()
# torch.Tensor.toPILImage(image)

plt.imshow(image.permute(1,2,0), cmap='gray')
plt.title(label=class_names[label])
plt.axis(False)

# _image = image.numpy()

# plt.imshow(_image)
# plt.title(label=class_names[label])
# plt.axis(False)

In [None]:
# plot more
torch.manual_seed(42)
fig = plt.figure(figsize=(9, 9))
rows, cols = 4, 5
for i in range(1, rows*cols+1):
    random_idx = torch.randint(0, len(train_data), size=(1, )).item()
    # print(random_idx)
    img, label = train_data[random_idx]
    fig.add_subplot(rows, cols, i)
    plt.imshow(img.squeeze(), cmap='gray')
    plt.title(class_names[label])
    plt.axis(False)

In [None]:
train_data, test_data

#### 2. prepare dataloader

The data is now in the form of PyTorch datasets.
A DataLoader turns a dataset into a Python iterable, splitting it into mini-batches.

1. It's more computationally efficient, since the hardware may not be able to hold or process the whole dataset at once
2. It gives our neural network more chances to update its gradients (and therefore its parameters) per epoch

dataloader: https://pytorch.org/docs/stable/data.html
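
A small sketch of the batching behaviour, using a made-up `TensorDataset` of 100 random samples in place of the real data (the names `toy_dataset` and `toy_loader` are assumptions for illustration):

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset: 100 fake 1x28x28 images with labels in [0, 10).
features = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
toy_dataset = TensorDataset(features, labels)

toy_loader = DataLoader(toy_dataset, batch_size=32, shuffle=True)

# The number of batches is ceil(num_samples / batch_size); the last batch may be smaller.
print(len(toy_loader), math.ceil(len(toy_dataset) / 32))  # 4 4

batch_sizes = [len(X) for X, y in toy_loader]
print(batch_sizes)  # [32, 32, 32, 4]
```

With `drop_last=True` the final, smaller batch would be skipped instead.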

In [None]:
from torch.utils.data import DataLoader

# setup batch size hyperparameter
BATCH_SIZE = 32

# turn datasets into iterables (batches)
train_dataloader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
test_dataloader = DataLoader(dataset=test_data, batch_size=BATCH_SIZE, shuffle=False)

train_dataloader, test_dataloader

In [None]:
next(iter(['a', 'b', 'c', 'd']))


In [None]:
print(f'Dataloader: {train_dataloader, test_dataloader}')
print(f'Length: train - {len(train_dataloader)} batches of {BATCH_SIZE}')
print(f'Length: test - {len(test_dataloader)} batches of {BATCH_SIZE}')

In [None]:
train_features_batch, train_labels_batch = next(iter(train_dataloader))
print(train_features_batch.shape, train_labels_batch.shape)
# show a sample from dataloader
torch.manual_seed(42)
random_idx = torch.randint(0, len(train_features_batch), size=[1]).item()
img, label = train_features_batch[random_idx], train_labels_batch[random_idx]
plt.imshow(img.squeeze(), cmap='gray')
plt.title(class_names[label])
plt.axis(False)
print(f'Image size: {img.shape}')
print(f'Label: {label}, label shape: {label.shape}')

In [None]:
len(train_features_batch)

#### Create model0

build a baseline model

In [None]:
# flatten layer
flatten_model = nn.Flatten()

# get a single sample
x = train_features_batch[0]

# flatten the sample
output = flatten_model(x)

print(f'Shape before flattening: {x.shape}')
print(f'Shape after flattening: {output.shape}')

In [None]:
input = torch.randn(32, 1, 5, 5)

# flatten = nn.Flatten() 
# output1 = flatten(input)
# output2 = flatten(output1)
# output2.size()
# print(input)
# output.squeeze()

In [None]:
from torch import nn

class FashionMNISTModelV0(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int,  output_shape: int):
        super().__init__()
        self.layer_stack = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.Linear(in_features=hidden_units, out_features=output_shape),
        )

    def forward(self, x):
        return self.layer_stack(x)

In [None]:
torch.manual_seed(42)

# setup model with input parameters
model_0 = FashionMNISTModelV0(
    input_shape=28*28, # or 784
    hidden_units=10, # how many units in the hidden layer
    output_shape=len(class_names) # one for every class
).to('cpu')

model_0

In [None]:
dummy_x = torch.rand([1, 1, 28, 28])

model_0(dummy_x)

In [None]:
model_0.state_dict()

#### 3.1 Setup loss, optimizer and evaluation metrics

* Loss function - `nn.CrossEntropyLoss()`
* Optimizer - `torch.optim.SGD()`
* Evaluation metric - since we are working on a classification problem, let's use accuracy as our evaluation metric
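
As a rough illustration of how these pieces fit together, the sketch below computes cross-entropy loss from raw logits and an accuracy percentage from the argmax predictions. `accuracy_sketch` is a hypothetical stand-in written for this example; the `accuracy_fn` downloaded in the next cell may be implemented differently.

```python
import torch

def accuracy_sketch(y_true: torch.Tensor, y_pred: torch.Tensor) -> float:
    """Percentage of predictions that match the true labels (hypothetical helper)."""
    correct = torch.eq(y_true, y_pred).sum().item()
    return correct / len(y_true) * 100

# Made-up logits for 4 samples and 3 classes; argmax picks the predicted class per sample.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.3, 0.2, 0.9],
                       [1.2, 0.4, 0.8]])
y_true = torch.tensor([0, 1, 2, 2])

loss = torch.nn.CrossEntropyLoss()(logits, y_true)  # expects raw logits, not probabilities
acc = accuracy_sketch(y_true, logits.argmax(dim=1))
print(f"accuracy: {acc:.1f}%")  # accuracy: 75.0%
```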

In [None]:
import requests
from pathlib import Path

urls = 'https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py'

# download helper function
if Path('helper_functions.py').is_file():
    print('helper_functions.py already exists.')
else:
    print('Downloading helper_functions.py')
    request = requests.get(url=urls)
    with open('helper_functions.py', 'wb') as file:
        file.write(request.content)


In [None]:
# import accuracy metric
from helper_functions import accuracy_fn

# setup loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)


### 3.2 Creating a function to time our experiments

Machine learning is experimental.

Two main things you'll often want to track are:
1. The model's performance (loss and accuracy values)
2. How fast it runs


In [None]:
from timeit import default_timer as timer

def print_train_time(start: float, end: float, device: torch.device = None):
    """ difference between start and end time """
    total_time = end - start
    print(f'Train time on {device}: {total_time:.3f} seconds')
    return total_time


In [None]:
start_time = timer()

end_time = timer()
print_train_time(start=start_time, end=end_time, device='cpu')


#### Creating a training loop and training a model on batches of data
The optimizer will update the model's parameters once per batch rather than once per epoch.

1. Loop through epochs
2. Loop through training batches, perform training steps, calculate the train loss *per batch*
3. Loop through testing batches, perform testing steps, calculate the test loss *per batch*
4. Print out what's happening
5. Time it all

In [None]:
#import tqdm for progress bar
from tqdm.auto import tqdm

# set the seed and start the timer
torch.manual_seed(42)
train_time_start_on_cpu = timer()

# set the number of epochs (we'll keep this small for faster training time)
epochs = 3

# create the training and test loop
for epoch in tqdm(range(epochs)):
    print(f'Epoch: {epoch}\n-------------')
    ### training
    train_loss = 0

    # add a loop through the training batches
    for batch, (X, y) in enumerate(train_dataloader):
        model_0.train()
        # 1. Forward pass
        y_pred = model_0(X)

        # 2. Calculate the loss (per batch)
        loss = loss_fn(y_pred, y)
        train_loss += loss # accumulate train loss

        # 3. optimizer zero grad
        optimizer.zero_grad()

        # 4. loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        if batch % 400 == 0:
            print(f'Looked at {batch * len(X)}/{len(train_dataloader.dataset)}')

    # divide total train loss by length of train dataloader (average loss per batch)
    train_loss /= len(train_dataloader)

    ### Testing loop (runs once per epoch, so it is indented inside the epoch loop)
    test_loss, test_acc = 0, 0
    model_0.eval()
    with torch.inference_mode():
        for X_test, y_test in test_dataloader:
            # 1. forward pass
            # test_pred_logits = model_0(X_test)
            test_pred = model_0(X_test)

            # 2. Calculate loss (accumulatively)
            test_loss += loss_fn(test_pred, y_test)

            # 3. Calculate accuracy
            test_acc += accuracy_fn(y_true=y_test, y_pred=test_pred.argmax(dim=1))

        # Calculate the test loss average per batch
        test_loss /= len(test_dataloader)

        # Calculate the test acc average per batch
        test_acc /= len(test_dataloader)

    # Print out what's happening
    print(f'\n Train loss: {train_loss:.4f} | Test Loss: {test_loss:.4f}, Test acc: {test_acc:.4f}')

train_time_end_on_cpu = timer()

total_train_time_model_0 = print_train_time(start=train_time_start_on_cpu,
                                            end=train_time_end_on_cpu,
                                            device=str(next(model_0.parameters()).device))


### Get predictions and model_0 results

In [None]:
torch.manual_seed(42)

def eval_model(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               accuracy_fn,
               device=None):
    """ Returns a dictionary containing the results of model predicting on data_loader """
    loss, acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in data_loader:
            X, y = X.to(device), y.to(device)
            # make predictions
            y_pred = model(X)

            # Accumulate the loss and acc values per batch
            loss += loss_fn(y_pred, y)
            acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))
        
        # Scale loss and acc to find the average loss/acc per batch
        loss /= len(data_loader)
        acc /= len(data_loader)
    return {"model_name": model.__class__.__name__,
            "model_loss": loss.item(),
            "model_acc": acc}


# Calculate model 0 results on test datasets
model_0_results = eval_model(model=model_0, 
                            data_loader=test_dataloader, 
                            loss_fn=loss_fn,
                            accuracy_fn=accuracy_fn,
                            device='cpu')

model_0_results

#### Setup device agnostic code (GPU)
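
A minimal sketch of what device-agnostic code looks like in practice: the device is chosen once, and both the model and its input data are moved to it. The small `nn.Linear` layer and random batch are placeholders for illustration only.

```python
import torch
from torch import nn

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

layer = nn.Linear(in_features=4, out_features=2).to(device)  # moves the parameters
batch = torch.rand(8, 4).to(device)                          # inputs must be on the same device

output = layer(batch)
print(output.shape, output.device)
```

If the model and the data end up on different devices, the forward pass raises a runtime error, so every batch needs the same `.to(device)` treatment inside the training and testing loops.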

In [None]:
import torch

# use the GPU if one is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

### building a model with non-linearity

In [None]:
# create a model with non-linear and linear layers

class FashionMNISTModelV1(nn.Module):
    def __init__(self, input_shape: int,
                hidden_units: int,
                output_shape: int):
        super().__init__()
        self.layer_stack = nn.Sequential(
            nn.Flatten(), # input into single vector
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.ReLU(),
            nn.Linear(in_features=hidden_units, out_features=output_shape),
            nn.ReLU()
        )

    def forward(self, x: torch.Tensor):
        return self.layer_stack(x)

In [None]:
next(model_0.parameters()).device

In [None]:
# create an instance model
torch.manual_seed(42)
model_1 = FashionMNISTModelV1(input_shape=784,
                              hidden_units=10,
                              output_shape=len(class_names)).to(device) # send to the GPU if available


next(model_1.parameters()).device

#### 6.1 Setup Loss, Optimizer and Evaluation metrics

In [None]:
from helper_functions import accuracy_fn

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model_1.parameters(), lr=0.1)


#### Functionizing training and evaluation / testing loop
* training loop - train_step()
* testing_loop - test_step()

In [None]:
def train_step(model: torch.nn.Module, 
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device = device):
    """Performs one training pass with model trying to learn on data_loader"""
    train_loss, train_acc = 0, 0
    
    for batch, (X, y) in enumerate(data_loader):
        model.train()
        # send the data to the target device (must match the model's device)
        X, y = X.to(device), y.to(device)
        # 1. forward pass
        y_pred = model(X)
        
        # 2. calculate loss & accuracy(per batch)
        loss = loss_fn(y_pred, y)
        train_loss += loss
        train_acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))
        
        # 3. optimizer zero grad
        optimizer.zero_grad()
        
        # 4. loss backward
        loss.backward()
        
        # 5. optimizer step (update the models parameters once *per batch*)
        optimizer.step()
        
        # print out what's happening
        # if batch % 400 == 0:
        #     print(f'Looked at {batch * len(X)}/{len(data_loader.dataset)} samples.')
            
    # divide the total train loss and accuracy by the number of batches in data_loader
    train_loss /= len(data_loader)
    train_acc /= len(data_loader)
    print(f'Train loss: {train_loss:.5f} | Train acc: {train_acc:.2f}%\n')

In [None]:
def test_step(model: torch.nn.Module,
              data_loader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module, 
              accuracy_fn,
              device: torch.device = device):
    
    test_loss, test_acc = 0, 0
    
    # model in eval mode
    model.eval()
    
    # turn on inference mode context manager
    with torch.inference_mode():
        for X_test, y_test in data_loader:    
            # send the data to the target device
            X_test, y_test = X_test.to(device), y_test.to(device)
            # 1. forward pass (output raw logits)
            test_pred = model(X_test)
            # 2. calculate the loss & accuracy
            loss = loss_fn(test_pred, y_test)
            test_loss += loss
            test_acc += accuracy_fn(y_true=y_test, y_pred=test_pred.argmax(dim=1))
        
        # adjust metrics and print out
        test_loss /= len(data_loader)
        test_acc /= len(data_loader)
        print(f'Test loss: {test_loss:.5f} | Test acc: {test_acc:.2f}%\n')

In [None]:
torch.manual_seed(42)

# measure time
from timeit import default_timer as timer
train_time_start_on_cpu = timer()

# set epochs
epochs = 20

# create an optimization and evaluation loop using train_step() and test_step()
for epoch in tqdm(range(epochs)):
    print(f'-----------Epoch: {epoch}---------\n')
    train_step(
        model=model_1,
        data_loader=train_dataloader,
        loss_fn=loss_fn,
        optimizer=optimizer,
        accuracy_fn=accuracy_fn,
        device=device
    )
    
    test_step(
        model=model_1,
        data_loader=test_dataloader,
        loss_fn=loss_fn,
        accuracy_fn=accuracy_fn,
        device=device
    )

train_time_end_on_cpu = timer()
total_train_time_model_1 = print_train_time(
    start=train_time_start_on_cpu, 
    end=train_time_end_on_cpu,
    device=device
)

In [None]:
model_0_results

In [None]:
print(total_train_time_model_0)
print(total_train_time_model_1)

In [None]:
# get model_1 results dictionary
model_1_results = eval_model(model=model_1,
                            data_loader=test_dataloader,
                            loss_fn=loss_fn,
                            accuracy_fn=accuracy_fn,
                            device=device) # data must be on the same device as model_1

model_1_results

#### Model 2: building convolutional neural network (cnn)

* CNNs are also known as ConvNets.
* CNNs are known for their ability to find patterns in visual data

https://poloclub.github.io/cnn-explainer/

In [None]:
### build CNN
class FashionMNISTModelV2(nn.Module):
    """
        Model architecture that replicates the TinyVGG
    """
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            # check hyperparameters
            nn.Conv2d(in_channels=input_shape, 
                    out_channels=hidden_units,
                    kernel_size=3,
                    stride=1,
                    padding=1),
            nn.ReLU(),
            nn.Conv2d(
                in_channels=hidden_units,
                out_channels=hidden_units,
                kernel_size=3, # 3x3 kernel, matching the other conv layers
                stride=1,
                padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden_units, 
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units, 
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # in_features must equal the flattened size of conv_block_2's output.
            # For 28x28 inputs the two max-pool layers leave 7x7 feature maps,
            # so that is hidden_units*7*7 (see the shapes printed in forward()).
            # The earlier placeholder `hidden_units*0` created an empty layer and triggered
            # `UserWarning: Initializing zero-element tensors is a no-op`.
            nn.Linear(in_features=hidden_units*7*7,
                      out_features=output_shape)
        )
        
    def forward(self, x):
        x = self.conv_block_1(x)
        print(f'block-1: {x.shape}')
        x = self.conv_block_2(x)
        print(f'block-2: {x.shape}')
        x = self.classifier(x)
        return x

In [None]:
torch.manual_seed(42)
# input shape is based on image color channels
model_2 = FashionMNISTModelV2(input_shape=1,
                              hidden_units=10,
                              output_shape=len(class_names)).to(device)


In [None]:
next(model_2.parameters())
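
One hedged way to work out the `in_features` of a classifier that follows convolutional blocks is to pass a dummy tensor through the blocks and read off the flattened size. The `feature_extractor` below is a hypothetical stand-in with two conv/pool stages, not the notebook's exact model.

```python
import torch
from torch import nn

# Hypothetical stand-in for a two-block convolutional feature extractor.
feature_extractor = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
    nn.Conv2d(in_channels=10, out_channels=10, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

# One fake FashionMNIST-sized image: [batch, channels, height, width].
dummy = torch.rand(1, 1, 28, 28)
with torch.inference_mode():
    feature_map = feature_extractor(dummy)

# Flatten everything except the batch dimension and count the features.
flattened_features = feature_map.flatten(start_dim=1).shape[1]
print(feature_map.shape)   # torch.Size([1, 10, 7, 7])
print(flattened_features)  # 490
```

The printed value is what a following `nn.Linear` would need as `in_features` for this particular stack and input size.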

#### stepping through Conv2d

In [None]:
torch.manual_seed(42)

# create a batch of images
images = torch.randn(size=(32, 3, 64, 64))
test_image = images[0]

print(f'Image batch shape: {images.shape}')
print(f'Single image shape: {test_image.shape}')
print(f'Test image\n: {test_image}')

test_image.shape

In [None]:
# create single conv2d layer
conv_layer = nn.Conv2d(in_channels=3,
                       out_channels=10,
                       kernel_size=(2, 2),
                       stride=1,
                       padding=0)

# pass the data through the convolutional layer
conv_output = conv_layer(test_image)
conv_output.shape

In [None]:
test_image.unsqueeze(0)

#### stepping through MaxPool2d

In [None]:
print(f'Test image original shape: {test_image.shape}')
print(f'Test image with unsqueeze dimension: {test_image.unsqueeze(0).shape}')
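
As a small worked example of what `nn.MaxPool2d` does, the sketch below pools a tiny 4x4 tensor so the result can be checked by eye. The input values are made up for illustration.

```python
import torch
from torch import nn

# A tiny 1x1x4x4 input holding the values 0..15 so the effect of pooling is easy to read.
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size
pooled = max_pool(x)

print(x.shape, "->", pooled.shape)  # torch.Size([1, 1, 4, 4]) -> torch.Size([1, 1, 2, 2])
print(pooled)
# Each output value is the maximum of one non-overlapping 2x2 window:
# [[ 5.,  7.],
#  [13., 15.]]
```

Height and width are halved while the number of channels stays the same, which is why each pooling layer shrinks the feature maps passed to later layers.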
