# Transfer Learning in Deep Learning

### What Is Transfer Learning?

Transfer learning is a machine learning approach in which a model trained on one task is used as the starting point for a model on a new task. The knowledge the first model has learned, in the form of its weights and the feature representations they encode, is carried over to the second model.

In deep learning, transfer learning is often used to solve problems with limited data. This is because deep learning models typically require a large amount of data to train, which can be difficult or expensive to obtain.

### Why Use Transfer Learning?

Here are some reasons why you might want to use transfer learning:

- `To save time and resources`: Training a deep learning model from scratch can be time-consuming and computationally expensive. Starting from a model that has already been trained on a large dataset cuts that cost.
- `To improve model performance`: The features a pre-trained model has already learned often carry over to the target task, which can improve performance there. This is especially helpful when you have limited data for the target task.

### Types of Transfer Learning

Transfer learning can be classified into two types:

- `Feature extraction`: In feature extraction, the pre-trained model is used to extract features from the data. These features are then used to train a new model on the target task. This is a good approach if you have limited data for the target task.

### Let's See How a Pre-Trained Model Is Used in Transfer Learning

```python
# Libraries
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Stand-in for a pre-trained model: a TF-IDF vectorizer whose vocabulary
# was fitted beforehand on separate "source" text
pre_trained_model = TfidfVectorizer()
pre_trained_model.fit(["This is a positive review", "This is a negative review"])

# Extract features from the target-task training data
X_train = pre_trained_model.transform(["I love this product", "This product is terrible"])

# Train a new classifier on the extracted features
# (labels assumed here: 0 = positive, 1 = negative)
y_train = np.array([0, 1])
lr = LogisticRegression()
lr.fit(X_train, y_train)

# Make predictions on new data
X_new = pre_trained_model.transform(["This product is the best!"])
y_pred = lr.predict(X_new)

# Print the prediction
print(y_pred)
```

```
[1]
```

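This approach is useful if you have limited data for the target task. For example, if you have only a small number of labeled reviews, you can use a pre-trained model to extract features from them and train a lightweight classifier on those features. The vectorizer above was fitted on just two sentences, so the prediction itself is not meaningful; the example only illustrates the workflow.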

- `Fine-tuning`: In fine-tuning, a pre-trained model is trained further on a new dataset to improve its performance on a specific task. The pre-trained model has typically been trained on a large, general-purpose dataset, while the new dataset is specific to the task at hand.
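
To make the idea of continuing training concrete, here is a minimal sketch of fine-tuning using plain NumPy logistic regression. The synthetic source and target datasets, the helper names (`train`, `log_loss`), and the learning rates are all assumptions made for illustration; a real workflow would start from a published deep network rather than a three-weight linear model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    """Mean logistic loss for labels in {0, 1}."""
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)

def train(w, X, y, lr, steps):
    """Run gradient descent on the logistic loss, starting from weights w."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)

# Hypothetical source task: plenty of labeled data
X_source = rng.normal(size=(1000, 3))
y_source = (X_source @ np.array([2.0, -1.0, 0.5]) > 0).astype(float)

# Hypothetical target task: a related decision rule, but only 20 labeled examples
X_target = rng.normal(size=(20, 3))
y_target = (X_target @ np.array([2.0, -0.5, 1.0]) > 0).astype(float)

# 1) Pre-train on the source task, starting from zero weights
w_pretrained = train(np.zeros(3), X_source, y_source, lr=0.5, steps=200)

# 2) Fine-tune: keep training on the target data, starting from the
#    pre-trained weights and using a smaller learning rate
w_finetuned = train(w_pretrained, X_target, y_target, lr=0.05, steps=50)

loss_before = log_loss(w_pretrained, X_target, y_target)
loss_after = log_loss(w_finetuned, X_target, y_target)
print(f"target loss before fine-tuning: {loss_before:.3f}")
print(f"target loss after fine-tuning:  {loss_after:.3f}")
```

The smaller learning rate in the second stage is a common choice, since it lets the model adapt to the target data without moving far from what it learned on the source task.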

### What Is a Pre-Trained Model?

A pre-trained model is a machine learning model that has already been trained on a large dataset, typically much larger than the dataset available for the target task. During that training the model learns to extract useful features from the data, and those features let the final model be trained more quickly and efficiently.

### Popular Pre-Trained Architectures

There are many popular pre-trained architectures, but some of the most common include:

- `VGG (Visual Geometry Group)`: VGG is a family of convolutional neural networks introduced in 2014. They are known for their simple, uniform design built from stacks of small 3×3 convolutions, and they have been used for a variety of tasks, including image classification, object detection, and segmentation.

- `ResNet (Residual Network)`: ResNet is a family of convolutional neural networks introduced in 2015. Their residual (skip) connections make much deeper networks trainable, and they have achieved state-of-the-art results on a variety of tasks.

- `BERT (Bidirectional Encoder Representations from Transformers)`: BERT is a language model introduced in 2018. It is known for learning context from both the left and the right of each token, and it has been used for a variety of natural language processing tasks, including question answering, sentiment analysis, and text summarization.

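
As a rough illustration of the "residual" idea behind ResNet, the sketch below adds a block's input back to its output through a skip connection. It is a simplification under stated assumptions: it uses dense layers in NumPy, whereas actual ResNets use convolutions and batch normalization, and the names `residual_block`, `W1`, and `W2` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Compute relu(x + F(x)), where F is a small two-layer transform."""
    fx = relu(x @ W1) @ W2   # residual branch F(x)
    return relu(x + fx)      # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))  # batch of 2 hypothetical feature vectors

# With zero weights the residual branch contributes nothing,
# so the block reduces to relu(x)
zeros = np.zeros((4, 4))
out_identity = residual_block(x, zeros, zeros)

# With small random weights the block makes a modest adjustment to its input
W1 = rng.normal(scale=0.1, size=(4, 4))
W2 = rng.normal(scale=0.1, size=(4, 4))
out = residual_block(x, W1, W2)
print(out.shape)  # (2, 4)
```

Because the block can fall back to passing its input through almost unchanged, stacking many such blocks does not make the network harder to train in the way stacking plain layers does.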

### Fine-tuning vs. Feature Extraction

In transfer learning, fine-tuning and feature extraction are two common techniques to adapt a pre-trained model to a new task or domain:

`Fine-tuning`: Fine-tuning involves taking a pre-trained model (typically trained on a source task) and training it further on a target task. During fine-tuning, the model’s weights are updated using the target task’s data while retaining some knowledge from the source task. Fine-tuning is particularly useful when the source and target tasks are closely related.

`Feature Extraction`: Feature extraction refers to using a pre-trained model as a fixed feature extractor. Instead of modifying the model’s weights, the model is used to extract relevant features from the input data. These features can then be fed into a new classifier or model specific to the target task.
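
The difference between the two techniques comes down to which weights are allowed to change. The sketch below shows this with a tiny two-layer network in NumPy. Everything in it is an assumption for illustration: `W_pretrained` is random and merely stands in for weights learned on a source task, and `adapt` is a hypothetical helper, not a library function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt(W_base, X, y, train_base, lr=0.1, steps=100):
    """Train a new linear head on top of a base layer.

    train_base=False -> feature extraction (base layer stays frozen)
    train_base=True  -> fine-tuning (base layer is updated too)
    """
    W = W_base.copy()
    head = np.zeros(W.shape[1])          # new task-specific classifier
    for _ in range(steps):
        H = np.tanh(X @ W)               # features produced by the base layer
        err = sigmoid(H @ head) - y      # gradient of the logistic loss w.r.t. the logits
        grad_head = H.T @ err / len(y)
        if train_base:
            # Backpropagate through the head and the tanh into the base layer
            dH = np.outer(err, head) * (1.0 - H ** 2)
            W -= lr * (X.T @ dH / len(y))
        head -= lr * grad_head
    return W, head

rng = np.random.default_rng(0)
W_pretrained = rng.normal(size=(4, 8))   # stand-in for pre-trained weights
X = rng.normal(size=(30, 4))             # small hypothetical target dataset
y = (X[:, 0] + X[:, 1] > 0).astype(float)

W_fe, head_fe = adapt(W_pretrained, X, y, train_base=False)
W_ft, head_ft = adapt(W_pretrained, X, y, train_base=True)

print("base unchanged after feature extraction:", np.array_equal(W_fe, W_pretrained))
print("base unchanged after fine-tuning:       ", np.array_equal(W_ft, W_pretrained))
```

In both cases a new head is trained for the target task; only fine-tuning also modifies the base layer, which is why it usually needs more target data and a careful learning rate.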