# TRANSFER LEARNING

Modern deep neural networks are data-hungry. They require very large datasets, with millions of items, to reach their peak performance.

Unfortunately, building such large datasets from scratch for every deep learning use case is very expensive and often not feasible.

**Transfer learning** is a technique that lets you take a neural network that has already been trained on one of these very large datasets and tweak it slightly to adapt it to a new dataset.

### Innovative CNN Architectures
Several CNN architectures achieved significant breakthroughs in the ImageNet competitions:
* **AlexNet:** AlexNet used the ReLU activation function, and also used Dropout to prevent overfitting. It has the structure of a classical CNN, with a backbone made of convolution and Max Pooling layers followed by a flattening operation and a Multi-Layer Perceptron.
* **VGG:** The designers pioneered the use of many 3 by 3 convolutions instead of fewer, larger kernels (for example, the first layer of AlexNet uses an 11 by 11 convolution). The height and width of the feature maps decrease as we go deeper into the network, thanks to the Max Pooling layers, while the number of feature maps increases. The backbone is then followed by a flattening operation and a regular head made of a Multi-Layer Perceptron.
* **ResNet:** ResNet introduced a fundamental innovation: the skip connection, which adds the input of a block directly to the block's output.
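
As a rough illustration of the skip connection mentioned for ResNet, the sketch below computes `relu(F(x) + x)` in plain Python. It is a simplification under stated assumptions: the real ResNet blocks use convolutions and batch normalization, whereas here `F` is two small linear layers, and the helper names (`relu`, `linear`, `residual_block`) are hypothetical.

```python
def relu(vector):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in vector]


def linear(vector, weights, bias):
    """Plain matrix-vector product plus bias (one row of weights per output)."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Return relu(F(x) + x), where F is two linear layers (an assumption)."""
    residual = linear(relu(linear(x, w1, b1)), w2, b2)
    # The skip connection: add the block input back onto the block output.
    return relu([r + xi for r, xi in zip(residual, x)])


x = [1.0, 2.0, 3.0]
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3

# With all weights at zero, F(x) = 0 and the block passes x through unchanged.
out = residual_block(x, zero_w, zero_b, zero_w, zero_b)
print(out)
```

With every weight set to zero the block simply returns its (non-negative) input, which hints at why skip connections make it easy for a deep network to fall back to the identity mapping.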

#### Global Average Pooling (GAP) Layer
A classic CNN has a first section comprised of several layers of convolutions and pooling, followed by a flattening and then one or more fully-connected layers (MLP).

The fully-connected layers (the head of a classic CNN) can only work with an input array of a specific size. Therefore, the vector produced by the flattening operation must have a specific number of elements, because it feeds into the fully-connected layers. Let's call this number of elements `H`. This means that the feature maps being flattened must have a specific size, so that `n_channels x height x width = H`. The height and width of the last feature maps are determined by the size of the input image and by how it shrinks as it flows through the convolutional and pooling layers, so this constraint on the flattened vector translates into a constraint on the size of the input image. Therefore, for CNNs that use a flattening layer, the input size must be decided a priori when designing the architecture.
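
To make the constraint concrete, here is a small arithmetic sketch. It assumes a hypothetical backbone of three blocks, each a 3 by 3 convolution (stride 1, padding 1) followed by 2 by 2 max pooling, with 64 output channels; these numbers are illustrative and are not taken from any specific architecture above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_size(image_size, n_blocks=3, n_channels=64):
    """Length H of the flattened vector for the hypothetical backbone."""
    size = image_size
    for _ in range(n_blocks):
        size = conv_out(size, kernel=3, stride=1, padding=1)  # keeps the size
        size = conv_out(size, kernel=2, stride=2)             # halves the size
    return n_channels * size * size


h_32 = flattened_size(32)  # 64 channels x 4 x 4
h_64 = flattened_size(64)  # 64 channels x 8 x 8
print(h_32, h_64)
```

A head whose first fully-connected layer expects `h_32` inputs could not accept the `h_64` vector produced by the larger image, which is exactly why the input size has to be fixed in advance.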

With GAP, instead of taking the last feature maps (the output of the last convolution) and flattening them into a long vector, we take the **average of each feature map** and place the averages in a much shorter vector. This drastically reduces the dimensionality of the resulting vector, from `n_channels x height x width` to just `n_channels`. More importantly, it makes the network adaptable to any input size, because the length of the resulting vector depends only on the number of feature maps in the last convolution and not on the image size.
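
The following plain-Python sketch shows the averaging step on dummy feature maps of two different spatial sizes. The helper names (`global_average_pool`, `make_maps`) are hypothetical; in a PyTorch model the same effect is typically obtained with `nn.AdaptiveAvgPool2d(1)`.

```python
from statistics import fmean


def global_average_pool(feature_maps):
    """Average each feature map (a list of rows) down to a single number."""
    return [fmean(value for row in fmap for value in row)
            for fmap in feature_maps]


def make_maps(n_channels, height, width):
    """Build dummy feature maps where channel c is filled with the value c."""
    return [[[float(c)] * width for _ in range(height)]
            for c in range(n_channels)]


# Same number of channels, different spatial sizes.
small = global_average_pool(make_maps(n_channels=4, height=2, width=2))
large = global_average_pool(make_maps(n_channels=4, height=7, width=5))
print(len(small), len(large))
```

Both calls return a vector with one entry per channel, so the layer that follows GAP sees the same input length regardless of the height and width of the feature maps.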

**Note:** A network with GAP trained on a certain image size will not respond well to drastically different image sizes, even though it will still output a result. Effectively, the input size becomes a tunable parameter that can be changed without affecting the architecture of the CNN.

Many modern architectures adopt the GAP layer.


### Attention Layers

#### Channel Attention: Squeeze and Excitation
*The term "channel" can refer to the channels in the input image (3 channels if RGB) but also to the feature maps that are output from a layer.*

**Channel attention** is a mechanism that a network can use to learn to pay more attention to (i.e., to boost) the feature maps that are useful for a specific example, and to pay less attention to the others.

This is accomplished by adding a sub-network (Squeeze and Excitation) that, given the input feature maps/channels, assigns a scale to each of them. The feature maps with the largest scales are boosted.
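
A minimal plain-Python sketch of this idea follows. It assumes the commonly described Squeeze-and-Excitation layout: a "squeeze" step that averages each feature map, an "excitation" step made of two small fully-connected layers (ReLU, then sigmoid), and a final rescaling of each map. The weights are hand-picked for illustration (in a real network they are learned), and all function names are hypothetical.

```python
import math
from statistics import fmean


def squeeze(feature_maps):
    """Squeeze: summarize each feature map by its average value."""
    return [fmean(v for row in fmap for v in row) for fmap in feature_maps]


def linear(vector, weights, bias):
    """Plain matrix-vector product plus bias (one row of weights per output)."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def excite(squeezed, w1, b1, w2, b2):
    """Excitation: FC -> ReLU -> FC -> sigmoid, giving one scale per channel."""
    hidden = [max(0.0, h) for h in linear(squeezed, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in linear(hidden, w2, b2)]


def squeeze_and_excite(feature_maps, w1, b1, w2, b2):
    """Multiply every feature map by its scale; return maps and scales."""
    scales = excite(squeeze(feature_maps), w1, b1, w2, b2)
    scaled = [[[s * v for v in row] for row in fmap]
              for fmap, s in zip(feature_maps, scales)]
    return scaled, scales


# Two 2x2 feature maps and a bottleneck with a single hidden unit.
maps = [[[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[1.0, 0.0]], [0.0]              # illustrative, not learned
w2, b2 = [[2.0], [-2.0]], [0.0, 0.0]      # illustrative, not learned

scaled, scales = squeeze_and_excite(maps, w1, b1, w2, b2)
print(scales)
```

With these weights the first channel receives a scale above 0.5 and the second a scale below 0.5, so the first feature map is emphasized relative to the second while the shape of the output stays the same as the input.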

#### Self Attention: Transformers in Computer Vision


# GIT TRACKING

In [1]:
!pip install python-dotenv --quiet

In [2]:
from dotenv import load_dotenv
import os

In [3]:
notebook_name = "transfer_learning_CNN.ipynb"
repo_name = "Transfer-Learning-in-CNN"
git_username = "omogbolahan94"
email = "gabrielomogbolahan1@gmail.com"

In [4]:
def push_to_git(notebook_name, repo_name, commit_m, git_username, email):
  token_path = '/content/drive/MyDrive/Environment-Variable/variable.env'
  load_dotenv(dotenv_path=token_path)
  GITHUB_TOKEN = os.getenv('GIT_TOKEN')

  USERNAME = f"{git_username}"
  REPO = f"{repo_name}"

  # Authenticated URL
  remote_url = f"https://{USERNAME}:{GITHUB_TOKEN}@github.com/{USERNAME}/{REPO}.git"
  if REPO not in os.listdir():
    !git clone {remote_url}

  # copy the notebook into the cloned repository
  notebook_path = f"/content/drive/My Drive/Colab Notebooks/{notebook_name}"
  !cp '{notebook_path}' '/content/{REPO}/'

  # ensure to be in the repository folder
  %cd '/content/{REPO}'

  # copy the saved model into the cloned repository
  if "cifar10_best_valid.pt" not in os.listdir():
    if os.path.exists('/content/cifar10_best_valid.pt'):
      !cp /content/cifar10_best_valid.pt /content/{REPO}/
  if 'cifar10_network.pt' not in os.listdir():
    # check for the file that is actually being copied
    if os.path.exists('/content/cifar10_network.pt'):
      !cp /content/cifar10_network.pt /content/{REPO}/

  # Reconfigure Git
  !git config --global user.name '{USERNAME}'
  !git config --global user.email '{email}'
  !git remote set-url origin '{remote_url}'

  print()
  !git add .
  !git commit -m '{commit_m}'
  !git push origin main

  # change back to the content directory
  %cd '/content'

In [5]:
commit_m = "Channel and self attention"

In [6]:
push_to_git(notebook_name, repo_name, commit_m, git_username, email)

Cloning into 'Transfer-Learning-in-CNN'...
cp: cannot stat '/content/drive/My Drive/Colab Notebooks/transfer_learning_CNN.ipynb': No such file or directory
/content/Transfer-Learning-in-CNN

On branch main

Initial commit

nothing to commit (create/copy files and use "git add" to track)
error: src refspec main does not match any
error: failed to push some refs to 'https://github.com/omogbolahan94/Transfer-Learning-in-CNN.git'
/content
