
Transfer Learning of Multimodal Models

In the preceding sections, we delved into the fundamental concepts behind multimodal models such as CLIP and related architectures. In this chapter, we will explore how you can adapt different types of multimodal models to your own tasks.

There are several approaches for adapting a multimodal model to your use case:

  1. Zero/few-shot learning. Zero/few-shot learning leverages large pretrained models that can solve problems not present in their training data. These approaches are useful when there is little labeled data for a task (5-10 examples) or none at all; a minimal zero-shot example is sketched right after this list. Unit 11 delves deeper into this topic.

  2. Training the model from scratch. This method becomes necessary when pretrained weights are unavailable or the pretraining data differs substantially from your own. Here, we initialize the model weights randomly (or via more sophisticated methods like He initialization) and proceed with the usual training. However, this approach demands substantial amounts of training data.

  3. Transfer learning. Transfer learning, unlike training from scratch, uses the weights of the pretrained model as initial weights.
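
To illustrate the first approach, here is a minimal zero-shot image classification sketch using the Hugging Face Transformers pipeline with the openai/clip-vit-base-patch32 checkpoint; the image path and candidate labels are placeholders you would replace with your own.

```python
from transformers import pipeline

# Zero-shot classification with a pretrained CLIP checkpoint:
# no task-specific training is needed, only candidate label names.
classifier = pipeline(
    task="zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

predictions = classifier(
    "path/to/your_image.jpg",  # placeholder: path or URL of your image
    candidate_labels=["a photo of a cat", "a photo of a dog", "a photo of a car"],
)
print(predictions)  # list of {"label": ..., "score": ...}, sorted by score
```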

This chapter primarily focuses on the transfer learning aspect within multimodal models. It will recap what transfer learning entails, elucidate its advantages, and provide practical examples illustrating how you can apply transfer learning to your tasks!

Transfer learning

More formally, transfer learning is a set of machine learning techniques in which knowledge, representations, or patterns obtained from solving one problem are reused to solve another, similar problem.

In the context of deep learning, transfer learning means that when training a model for a particular task, we use the weights of another, pretrained model as the initial weights. The pretrained model has typically been trained on a huge amount of data and has captured useful knowledge about the nature of and relationships within that data. This knowledge is embedded in the model's weights, so by using them as initial weights we transfer the knowledge embedded in the pretrained model into the model we are training.

Transfer Learning
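
As a rough sketch of what this looks like in code (assuming the Hugging Face Transformers library, CLIP's vision encoder as the pretrained backbone, and a placeholder class count), we can initialize a model from pretrained weights and attach a freshly initialized classification head:

```python
import torch
from transformers import CLIPVisionModel

num_labels = 10  # placeholder: number of classes in your own dataset

# The backbone starts from pretrained CLIP weights instead of a random
# initialization; only the new classification head is trained from scratch.
backbone = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
head = torch.nn.Linear(backbone.config.hidden_size, num_labels)

def classify(pixel_values):
    # The pooled output of the pretrained encoder feeds the new head.
    features = backbone(pixel_values=pixel_values).pooler_output
    return head(features)
```

During fine-tuning, both the backbone and the head can be updated, or the backbone can be kept partially frozen, as discussed below.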

This approach has several advantages:

  • Resource Efficiency: Since the pretrained model was trained on a huge amount of data over a long period, transfer learning requires far fewer computing resources for the model to converge.

  • Reduced need for labeled data: For the same reason, less labeled data is required to achieve decent quality on the test set.

  • Knowledge Transfer: When fine-tuning to the new task, the model capitalizes on the pre-existing knowledge encoded within the pre-trained model's weights. This integration of prior knowledge often leads to enhanced performance on the new task.

However, despite its advantages, transfer learning has some challenges that should also be taken into account:

  • Domain Shift: Adapting knowledge from the source domain to the target domain can be challenging if the data distributions differ substantially.

  • Catastrophic forgetting: During the fine-tuning process, the model adjusts its parameters to adapt to the new task, often causing it to lose previously learned knowledge or representations related to the initial task (a common mitigation is sketched right after this list).
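
One common way to soften both issues is to freeze most of the pretrained weights and fine-tune only a small part of the model. The sketch below makes the same assumptions as the previous snippet (CLIP vision backbone, placeholder class count and learning rate):

```python
import torch
from transformers import CLIPVisionModel

backbone = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
head = torch.nn.Linear(backbone.config.hidden_size, 10)  # 10 is a placeholder class count

# Freeze the pretrained encoder so its representations are preserved...
for param in backbone.parameters():
    param.requires_grad = False
# ...and unfreeze only the last transformer block for light adaptation.
for param in backbone.vision_model.encoder.layers[-1].parameters():
    param.requires_grad = True

# Only the unfrozen block and the new head are updated during fine-tuning.
trainable = [p for p in backbone.parameters() if p.requires_grad] + list(head.parameters())
optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # placeholder learning rate
```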

Transfer Learning Applications

We'll explore practical applications of transfer learning across various tasks. Navigate to the Jupyter notebook relevant to your task of interest from the provided table.

| Task | Description | Model | Notebook |
| --- | --- | --- | --- |
| Fine-tune CLIP | Fine-tuning CLIP on a custom dataset | openai/clip-vit-base-patch32 | CLIP notebook |
| VQA | Answering a question in natural language based on an image | dandelin/vilt-b32-finetuned-vqa | VQA notebook |
| Image-to-Text | Describing an image in natural language | Salesforce/blip-image-captioning-large | Text 2 Image notebook |
| Open-set object detection | Detecting objects from natural language input | Grounding DINO | Grounding DINO notebook |
| Assistant (GPT-4V like) | Instruction tuning in the multimodal field | LLaVA | LLaVA notebook |
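
Before opening a notebook, you can get a quick feel for one of the pretrained checkpoints from the table with just a few lines of code. For example, here is a minimal visual question answering sketch with dandelin/vilt-b32-finetuned-vqa; the image path and question are placeholders.

```python
from transformers import pipeline

# Visual question answering with the pretrained ViLT checkpoint from the table.
vqa = pipeline(
    task="visual-question-answering",
    model="dandelin/vilt-b32-finetuned-vqa",
)

answers = vqa(
    image="path/to/your_image.jpg",  # placeholder: path or URL of your image
    question="What is in the picture?",
)
print(answers)  # list of {"answer": ..., "score": ...}
```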