# Creating tensors and manipulating them

## Creating tensors and basic properties

**Create Tensors**
* `torch.tensor()`: create a tensor from given data (e.g. nested lists of numbers)
* `torch.rand()`: create a random tensor with given dimensions (values drawn uniformly from [0, 1))
* `torch.zeros()`: create a tensor filled with zeros
* `torch.ones()`: create a tensor filled with ones
* `torch.arange()`: create a range (similar to function `range` but output is a tensor)

**Attributes**
* `.dtype`: data type
* `.type()`: return the tensor cast to a new data type, e.g. `x.type(torch.float16)`
* `.shape`: shape of the tensor
* `.device`: on which device the tensor lives

**Misc**
* `torch.manual_seed()`: to set the random seed, so that random tensors are reproducible


```python
import torch

print(torch.tensor([7, 7]))                    # 1-D tensor (vector)
print(torch.tensor([[1, 2], [3, 4]]))          # 2-D tensor (matrix)
print(torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))  # 3-D tensor
```
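
The other creation functions and the attributes listed above can be tried together. This is a minimal sketch; the shapes, the seed value and the variable names are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(42)                 # fix the seed so torch.rand() is reproducible

random_t = torch.rand(3, 4)           # uniform values in [0, 1), shape (3, 4)
zeros_t = torch.zeros(2, 3)           # all zeros
ones_t = torch.ones(2, 3)             # all ones
range_t = torch.arange(0, 10, 2)      # start, end (excluded), step

print(random_t.shape)                 # torch.Size([3, 4])
print(zeros_t.dtype)                  # torch.float32 (the default float type)
print(range_t.dtype)                  # torch.int64 (integer arguments give an integer tensor)
print(random_t.device)                # cpu, unless the tensor was created on another device

# .type() returns a new tensor with the requested dtype; zeros_t itself is unchanged
half_t = zeros_t.type(torch.float16)
print(half_t.dtype)                   # torch.float16
```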


## Manipulating tensors

* `.reshape()`: change the shape of the tensor
* `.view()`: create a view with a different shape: the view and the original tensor share the same data, so changing one changes the other
* `torch.stack()`: stack tensors of the same shape along a new dimension
* `.permute()`: change the order of dimensions, *useful to move the colour channel from first to last and vice versa*
* `.squeeze()` and `.unsqueeze()`: remove or add dimensions to a tensor
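
The manipulation methods above could be exercised as follows. This is only a sketch: the tensor sizes and the `image` tensor are hypothetical, chosen so that the resulting shapes are easy to follow.

```python
import torch

x = torch.arange(1., 9.)                  # 8 floats: 1. to 8.

reshaped = x.reshape(2, 4)                # same 8 elements as 2 rows x 4 columns

viewed = x.view(4, 2)                     # a view shares memory with x
viewed[0, 0] = 100.                       # ...so this also changes x[0]
print(x[0])                               # tensor(100.)

stacked = torch.stack([x, x, x], dim=0)   # new leading dimension: shape (3, 8)

image = torch.rand(224, 224, 3)           # hypothetical image: (height, width, channels)
channels_first = image.permute(2, 0, 1)   # reordered to (channels, height, width)

batched = x.unsqueeze(dim=0)              # add a dimension of size 1: shape (1, 8)
squeezed = batched.squeeze()              # remove all dimensions of size 1: shape (8,)
```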


# PyTorch `nn.Module()`

* `nn.Module`: base class for models: a subclass needs an `__init__()` method (which calls `super().__init__()`) and a `forward()` method
* `nn.Parameter()`: to manually define parameters
* loss functions:
  * `nn.L1Loss`: MAE (mean absolute error) loss for fitting linear models (`nn.MSELoss` is the MSE counterpart)
  * `nn.CrossEntropyLoss`: cross entropy for multi-class classification
  * `nn.BCEWithLogitsLoss()`: for binary classification, includes the sigmoid activation function, so it expects the model's raw logits as input. Use `torch.sigmoid()` to transform the logits into prediction probabilities
* `torch.optim.SGD()`: Stochastic Gradient Descent algorithm
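
To show how these pieces fit together, here is a minimal sketch of a linear regression model. The class name `LinearRegressionModel`, the toy data, the zero initialisation and the learning rate are all assumptions made for illustration, not part of the notes above.

```python
import torch
from torch import nn

class LinearRegressionModel(nn.Module):   # hypothetical example model
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensors so the optimizer can update them
        # (zeros keep this sketch deterministic; random values are more common)
        self.weight = nn.Parameter(torch.zeros(1))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # forward() defines the computation: y = weight * x + bias
        return self.weight * x + self.bias

model = LinearRegressionModel()
loss_fn = nn.L1Loss()                                     # mean absolute error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # lr is an arbitrary choice

# Toy data following y = 0.7 * x + 0.3
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3

loss_before = loss_fn(model(X), y).item()
for _ in range(100):
    loss = loss_fn(model(X), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
loss_after = loss_fn(model(X), y).item()
print(f"loss before: {loss_before:.3f} | loss after: {loss_after:.3f}")
```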

**`nn.Module` possible transformations**
* `nn.Sequential()`: to put together several transformations
* `nn.Linear()`: for linear transformation (e.g. simple linear regression)
* `nn.Conv2d()`: convolution step
* `nn.ReLU()`: rectified linear activation function $\max(0, x)$
* `nn.Flatten()`: flatten a multi-dimensional tensor into a vector (by default every dimension except the batch dimension)
* `nn.MaxPool2d()`: take maximum over a square of pixels and reduce dimensions

**Methods and Attributes for a model:**
* `a_model.state_dict()`: to get dictionary of parameters
* `a_model.eval()`, `a_model.train()`: eval and train status
* `with torch.inference_mode():`: context manager that turns off gradient tracking, recommended when making predictions or calculating test performance
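
A short sketch of these model methods in use. The single-layer `model` below is a hypothetical stand-in for any `nn.Module`.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(in_features=2, out_features=1))  # hypothetical tiny model

# state_dict() maps parameter names to tensors
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))

model.eval()                          # evaluation mode (affects layers such as dropout and batch norm)
with torch.inference_mode():          # no gradient tracking inside this block
    preds = model(torch.rand(5, 2))

print(preds.requires_grad)            # False
model.train()                         # back to training mode
print(model.training)                 # True
```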


## TinyVGG architecture model

```python
from torch import nn

class TinyVGGArchitecture(nn.Module):
  """
  Model architecture replicating TinyVGG
  from CNN explainer website.
  """
  def __init__(self,
               input_shape: int,
               hidden_units: int, # number of hidden units, it's not the size of each convolved picture
               output_shape: int):
    super().__init__()
    # architecture: multiple blocks
    # convolutional blocks: multiple layers
    self.conv_block_1 = nn.Sequential(
        nn.Conv2d(in_channels=input_shape, # convolutional 2 dimensional
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1), # we set these values in NN
        nn.ReLU(),
        nn.Conv2d(in_channels=hidden_units, # convolutional 2 dimensional
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2,
                     stride=2) # by default same as kernel size
    )
    self.conv_block_2 = nn.Sequential(
        nn.Conv2d(in_channels=hidden_units,
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.Conv2d(in_channels=hidden_units,
                  out_channels=hidden_units,
                  kernel_size=3,
                  stride=1,
                  padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2)
    )
    # last block needs to output a classifier
    self.classifier = nn.Sequential(
        nn.Flatten(),
        # 7*7 assumes 28x28 input images: each MaxPool2d halves height and width (28 -> 14 -> 7)
        nn.Linear(in_features=hidden_units*7*7,
                  out_features=output_shape)
    )
  def forward(self, x):
    x = self.conv_block_1(x)
    # print(x.shape) # to help get the right size in the Linear layer
    x = self.conv_block_2(x)
    # print(x.shape)
    x = self.classifier(x)
    # print(x.shape)
    return x
```
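
The `hidden_units*7*7` in the classifier depends on the input image size, which the notes do not state. The sketch below traces the shapes under the assumption of a single-channel 28x28 input and `hidden_units=10`; both values are illustrative, not taken from the source.

```python
import torch
from torch import nn

dummy_batch = torch.rand(1, 1, 28, 28)    # assumed input: one grayscale 28x28 image

conv = nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2)        # stride defaults to kernel_size

x = conv(dummy_batch)
print(x.shape)     # torch.Size([1, 10, 28, 28]): padding=1 keeps height and width
x = pool(x)
print(x.shape)     # torch.Size([1, 10, 14, 14]): pooling halves height and width
x = pool(x)
print(x.shape)     # torch.Size([1, 10, 7, 7]): a second pooling step gives 7x7
flat = nn.Flatten()(x)
print(flat.shape)  # torch.Size([1, 490]), i.e. 10 * 7 * 7
```

For a different image size, the commented-out `print(x.shape)` lines in `forward()` serve the same purpose: they reveal the value that `in_features` needs.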

## Fitting a model

1. Set the number of epochs (full passes over the training data)
1. Set up the loop over epochs
1. Set up the loop over batches in a DataLoader
1. `a_model.train()`: Put the model in train mode
1. `a_model(X_data)`: Do a forward pass
1. `loss_fn(y_pred, y_train)`: Calculate the train loss
   * It may be necessary to transform the output, e.g. from logits to probabilities
1. `optimizer.zero_grad()`: Reset the gradients tracked by the optimizer
1. `loss.backward()`: Perform backpropagation on the loss
1. `optimizer.step()`: Perform an optimizer step to update the parameters

```python
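import torch
from timeit import default_timer as timer
from tqdm.auto import tqdm

# model_0, loss_fn, optimizer, accuracy_fn, train_dataloader and
# test_dataloader are assumed to be defined earlier

# set the seed and start the timer
torch.manual_seed(42)
train_time_start_on_cpu = timer()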

# set number of epochs
epochs = 3

# create training and test loop
for epoch in tqdm(range(epochs)):
  print(f"Epoch: {epoch}\n---------")
  ### Training
  train_loss = 0 # cumulates loss per batch
  # Loop through batches
  for batch, (X, y) in enumerate(train_dataloader):
    model_0.train()
    # forward pass
    y_pred = model_0(X)
    # loss
    loss = loss_fn(y_pred, y)
    train_loss += loss.item() # accumulates the train loss as a Python float
    # optimizer reset
    optimizer.zero_grad()
    # loss backward
    loss.backward()
    # optimizer step: updating model parameters once per BATCH
    optimizer.step()
    if batch % 400 == 0:
      print(f"Looked at {batch * len(X)}/{len(train_dataloader.dataset)} samples.")

  # back to epoch loop
  # divide loss by length dataloader
  train_loss /= len(train_dataloader)

  # testing loop
  model_0.eval()
  test_loss, test_acc = 0, 0
  with torch.inference_mode():
    for X_test, y_test in test_dataloader:
      # forward pass
      test_pred = model_0(X_test)
      # loss
      test_loss += loss_fn(test_pred, y_test)
      # accuracy
      test_acc += accuracy_fn(y_true=y_test, y_pred=test_pred.argmax(dim=1))
    # calculate the test loss average per batch
    test_loss /= len(test_dataloader)
    # accuracy average
    test_acc /= len(test_dataloader)

  # train accuracy is not computed in this loop, so only the test accuracy is reported
  print(f"\nTrain loss: {train_loss:.4f} | Test loss: {test_loss:.4f} | Test acc: {test_acc:.2f}%")
```

# Loading and creating datasets

## DataLoader

`from torch.utils.data import DataLoader` to create batches of data, since processing all images at the same time is often computationally infeasible. Good batch sizes are powers of 2, like 32 or 64.
* use `next(iter(aDataLoader))` to access one batch of data/images
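
A minimal sketch of batching with a `DataLoader`. The random `images` and `labels` tensors are hypothetical stand-ins for a real image dataset, wrapped in a `TensorDataset` for convenience.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset: 100 grayscale 28x28 "images" with labels from 10 classes
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(low=0, high=10, size=(100,))
dataset = TensorDataset(images, labels)

train_dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

print(len(train_dataloader))              # 4 batches: 32 + 32 + 32 + 4 samples

X_batch, y_batch = next(iter(train_dataloader))
print(X_batch.shape)                      # torch.Size([32, 1, 28, 28])
print(y_batch.shape)                      # torch.Size([32])
```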

# Evaluating models

## Torchvision

* `from torchvision import datasets`: contains ready-made datasets
* `from torchvision import transforms`: contains transformations to adapt images to the correct format/size or to augment data

## Torchmetrics

```python
try:
  import torchmetrics
except ImportError:
  !pip install -q torchmetrics
  import torchmetrics
```

Contains functions to help evaluate models
* `Accuracy()`

### Confusion matrix

```python
from torchmetrics import ConfusionMatrix
from mlxtend.plotting import plot_confusion_matrix
```




# `sklearn` useful functions

* `from sklearn.datasets import make_circles, make_moons, make_blobs`: to create artificial datasets
* `from sklearn.model_selection import train_test_split`: to split dataset into train and test datasets
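
A small sketch combining the two: generate an artificial dataset and split it. The sample count, noise level, split ratio and random seed are arbitrary illustrative values.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split

# 1000 two-dimensional points arranged in two concentric circles
X, y = make_circles(n_samples=1000, noise=0.03, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42   # 80% train, 20% test
)

print(X_train.shape, X_test.shape)         # (800, 2) (200, 2)
```

The outputs are NumPy arrays; to use them with PyTorch they can be converted, for example with `torch.from_numpy(X_train).type(torch.float)`.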

# Misc

* `from tqdm.auto import tqdm`: to have a progress bar when running a loop
* `from timeit import default_timer as timer`: to get a timestamp for measuring how long code takes to run
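
Both helpers can be combined to time a loop while showing its progress. The loop body below is a placeholder for real work such as a training epoch.

```python
from timeit import default_timer as timer
from tqdm.auto import tqdm

start_time = timer()
total = 0
for i in tqdm(range(1000)):               # tqdm wraps any iterable and shows progress
    total += i                            # placeholder for real work
end_time = timer()

print(f"Sum: {total} | elapsed: {end_time - start_time:.4f} seconds")
```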