## 3.2 Introduction to PyTorch - Transfer Learning and Hugging Face Introduction

Transfer Learning - A model trained for one task is reused as the starting point for a model on a second task
1. Select a source model from a repository of models
2. Reuse the trained model as the starting point for the new task

Fine-tuning options for a pre-trained model:
1. Update the whole model on labeled data, plus any additional layers added on top (slowest - usually best performing)
2. Freeze a subset of the model so that only part of it is updated. For example, turn off training on everything up to the xth encoder (faster than 1 - average performance)
3. Freeze the whole model and only train the additional layers added on top of the pre-existing language model. Only the feed-forward classifier is updated, which takes the output of the model and transforms it into a representation of our downstream task. The model does its best with what it has already seen (faster than 2 - usually worst performance)
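
The three options above differ only in which parameters are allowed to receive gradient updates. A minimal sketch in plain PyTorch, assuming a hypothetical toy model (four linear "encoder" layers and a small classifier head, both invented for illustration rather than taken from a real pre-trained model), might look like this:

```python
import torch
from torch import nn

# Hypothetical toy model: 4 "encoder" layers plus a classifier head added on top
encoder = nn.ModuleList([nn.Linear(8, 8) for _ in range(4)])
classifier = nn.Linear(8, 2)


def set_trainable(module, trainable):
    """Turn gradient updates on or off for every parameter in a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def count_trainable(*modules):
    """Count the parameters that would still be updated during training."""
    return sum(p.numel() for m in modules for p in m.parameters() if p.requires_grad)


# Option 1: update the whole model (the default, nothing is frozen)
option_1 = count_trainable(encoder, classifier)

# Option 2: freeze everything up to the xth encoder layer
x = 2
for layer in encoder[:x]:
    set_trainable(layer, False)
option_2 = count_trainable(encoder, classifier)

# Option 3: freeze the whole encoder and train only the classifier head
set_trainable(encoder, False)
option_3 = count_trainable(encoder, classifier)

print(option_1, option_2, option_3)  # 306 162 18
```

Fewer trainable parameters means fewer gradients to compute and store, which is why each option is faster than the one before it.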

Process of fine-tuning a model
1. Training data: Use our training data to update the model
2. Model: The model's predictions are scored by a loss function, which indicates how wrong or right the model is at predicting the training data.
    * The computed loss is used to compute gradients
        * These gradients are used to optimize the weights, which updates the model as a whole
3. This process continues in the training cycle until we are satisfied with the model's performance

The Hugging Face Trainer API takes care of the training loop (steps 2 and 3)
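
To make the loop concrete, here is a rough sketch of steps 1 to 3 written by hand in plain PyTorch. The data, the single linear layer, the learning rate, and the number of passes are all made up for illustration, and this is not how the Trainer is implemented internally, only the same idea in miniature:

```python
import torch
from torch import nn

torch.manual_seed(0)

# 1. Training data (made up): 32 examples with 4 features and a binary label
features = torch.randn(32, 4)
labels = (features.sum(dim=1) > 0).long()

# A deliberately tiny model, standing in for a pre-trained model plus classifier
model = nn.Linear(4, 2)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(20):              # 3. repeat until we are satisfied
    optimizer.zero_grad()            # clear gradients from the previous pass
    logits = model(features)         # 2. the model predicts on the training data
    loss = loss_fn(logits, labels)   #    the loss says how wrong the predictions are
    loss.backward()                  #    the loss is used to compute gradients
    optimizer.step()                 #    the gradients are used to update the weights
    losses.append(loss.item())

print(f'first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}')
```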

#### Hugging Face Trainer API Key Objects
**Dataset** - Holds all the data and its training/testing splits<br>
**DataCollator** - Forms batches of data from a Dataset<br>
**TrainingArguments** - Keeps track of training arguments like saving strategy and learning rate scheduler patterns<br>
**Trainer** - API to the PyTorch training loop for most standard cases<br>
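
The roles of the Dataset and DataCollator can be illustrated without the Hugging Face library at all. The sketch below is a plain PyTorch analogue, not the Hugging Face classes themselves: `ToyDataset` and `collate` are hypothetical names, and the token ids are invented. It shows a dataset holding individual examples and a collate function padding them into equally sized batches.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset of variable-length token id sequences with labels."""

    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return {'input_ids': torch.tensor(self.sequences[idx]), 'label': self.labels[idx]}


def collate(examples):
    """Pad every sequence in the batch to the same length and stack the labels."""
    input_ids = pad_sequence(
        [example['input_ids'] for example in examples],
        batch_first=True,
        padding_value=0,
    )
    labels = torch.tensor([example['label'] for example in examples])
    return {'input_ids': input_ids, 'labels': labels}


dataset = ToyDataset([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]], [0, 1, 0, 1])
loader = DataLoader(dataset, batch_size=2, collate_fn=collate)

batches = list(loader)
for batch in batches:
    print(batch['input_ids'].shape, batch['labels'])
```

Each batch is padded only to the length of its own longest sequence, so the first batch has shape (2, 3) and the second has shape (2, 4).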

In [3]:
import torch

In [None]:
# 1-dimensional tensor

one_d_tensor = torch.LongTensor([0, 1, 2, 3, 4]) # 1-D tensor with 5 elements

print(f'Shape of {one_d_tensor} is {one_d_tensor.shape} and dimension is {one_d_tensor.dim()}')

Shape of tensor([0, 1, 2, 3, 4]) is torch.Size([5]) and dimension is 1


In [5]:
# another 1-dimensional tensor

one_d_tensor = torch.LongTensor([0, 1, 2])

print(f'Shape of {one_d_tensor} is {one_d_tensor.shape} and dimension is {one_d_tensor.dim()}')

Shape of tensor([0, 1, 2]) is torch.Size([3]) and dimension is 1


In [None]:
# 2-dimensional tensor

two_d_tensor = torch.LongTensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) # 2-D tensor with 3 rows and 3 columns. The shape is (3, 3)

print(two_d_tensor.shape)

print(f'Shape of {two_d_tensor} is {two_d_tensor.shape} and dimension is {two_d_tensor.dim()}')

torch.Size([3, 3])
Shape of tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]) is torch.Size([3, 3]) and dimension is 2


#### unsqueeze and squeeze
Used to change the number of dimensions of a tensor
* unsqueeze forces a dimension of size 1 to exist at the given position - for example, to add a batch dimension
* squeeze does the reverse and removes dimensions of size 1

In [None]:
one_d_tensor = torch.LongTensor([0, 1, 2])

print(f'Shape of {one_d_tensor} is {one_d_tensor.shape} and dimension is {one_d_tensor.dim()}')

# convert the 1-dimensional tensor to a 2-dimensional tensor by forcing a dimension in the front
# this is useful when we want to force a "batch" dimension in order to predict on a single example
# the shape changes from (3,) to (1, 3); the 1 indicates a batch containing a single data point
two_d_tensor = one_d_tensor.unsqueeze(0)

print(f'Shape of {two_d_tensor} is {two_d_tensor.shape} and dimension is {two_d_tensor.dim()}')

Shape of tensor([0, 1, 2]) is torch.Size([3]) and dimension is 1
Shape of tensor([[0, 1, 2]]) is torch.Size([1, 3]) and dimension is 2
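
The heading also mentions squeeze, which undoes unsqueeze by removing a dimension of size 1. A short sketch (the variable names `batched`, `column`, and `restored` are chosen here for illustration) showing both directions, and how the argument to unsqueeze decides where the new dimension goes:

```python
import torch

one_d_tensor = torch.LongTensor([0, 1, 2])

# unsqueeze(0) adds the new dimension in the front: shape (3,) becomes (1, 3)
batched = one_d_tensor.unsqueeze(0)

# unsqueeze(1) adds the new dimension at the end instead: shape (3,) becomes (3, 1)
column = one_d_tensor.unsqueeze(1)

# squeeze(0) removes dimension 0 again because its size is 1: shape (1, 3) becomes (3,)
restored = batched.squeeze(0)

print(batched.shape, column.shape, restored.shape)
# torch.Size([1, 3]) torch.Size([3, 1]) torch.Size([3])
```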


#### Converting a tensor to a NumPy array

In [8]:
# convert from PyTorch to NumPy

two_d_tensor.numpy()

array([[0, 1, 2]])

In [9]:
# convert from PyTorch to NumPy with detach, which returns a tensor that is no longer part of the computation graph (will be useful later)

two_d_tensor.detach().numpy()

array([[0, 1, 2]])
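
Both calls give the same result here because `two_d_tensor` does not track gradients. The difference shows up once a tensor is part of a computation graph. The sketch below uses a made-up `weights` tensor with `requires_grad=True` to show that `.numpy()` alone is rejected, while `.detach().numpy()` works:

```python
import torch

# A made-up tensor that tracks gradients, as model weights and outputs do in training
weights = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
output = weights * 2  # output is now part of a computation graph

try:
    output.numpy()  # PyTorch refuses this for tensors that require grad
    error_message = None
except RuntimeError as error:
    error_message = str(error)

# detach() returns a tensor outside the graph, which can then be converted
as_array = output.detach().numpy()

print(error_message)
print(as_array)
```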