Natural Language Processing (NLP) and Large Language Models (LLMs): Fine-Tuning an LLM with Trainer and DeepSpeed
In this notebook we're going to fine-tune an LLM:
Many LLMs are general-purpose models trained on a broad range of data and use cases. This enables them to perform well in a variety of applications, as shown in previous modules. It is not uncommon, though, to find situations where a general-purpose model performs unacceptably on a specific dataset or use case. This often does not mean that the general-purpose model is unusable. With some new data and additional training, the model could perhaps be improved, or fine-tuned, so that it produces acceptable results for the specific use case.
Fine-tuning uses a pre-trained model as a base and continues to train it with a new, task-targeted dataset. Conceptually, fine-tuning leverages what the model has already learned and focuses that learning further on a specific task.
It is important to recognize that fine-tuning is model training. The training process remains a resource-intensive and time-consuming effort, although starting from a pre-trained model greatly shortens the training time.
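As a toy illustration of this idea, the sketch below trains a single weight with plain gradient descent. It is not how an LLM is trained; the function `train`, the learning rate, the tolerance, and the target values are all hypothetical choices made only to show that continuing from an already-trained weight still requires training steps, but fewer than starting from scratch.

```python
def train(w_init, w_target, lr=0.1, tol=1e-3, max_steps=10_000):
    """Minimize (w - w_target)**2 with gradient descent.

    Returns the final weight and the number of update steps taken.
    All values here are illustrative, not taken from a real model.
    """
    w = w_init
    for step in range(max_steps):
        if abs(w - w_target) < tol:
            return w, step
        # Gradient of (w - w_target)**2 with respect to w is 2 * (w - w_target).
        w -= lr * 2 * (w - w_target)
    return w, max_steps


# "Pre-training": learn a general-purpose weight starting from zero.
pretrained_w, pretrain_steps = train(w_init=0.0, w_target=2.0)

# "Fine-tuning": continue from the pre-trained weight toward a nearby,
# task-specific optimum.
_, finetune_steps = train(w_init=pretrained_w, w_target=2.5)

# Baseline: learn the task-specific optimum from scratch.
_, scratch_steps = train(w_init=0.0, w_target=2.5)

print(f"fine-tuning steps: {finetune_steps}, from-scratch steps: {scratch_steps}")
```

In this toy setting the fine-tuning run still performs real update steps, which mirrors the point that fine-tuning is training, while needing fewer of them than the from-scratch run.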
By the end of this notebook, you will be able to:
- Prepare a novel dataset
- Fine-tune the t5-small model to classify movie reviews
- Train using DeepSpeed
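To make the first objective more concrete, here is a minimal sketch of what preparing movie reviews for a text-to-text model such as T5 might look like. T5 takes text in and produces text out, so a classification label has to be expressed as a target string. The task prefix, the label strings, the field names, the HTML line-break cleanup, and the two sample reviews are all assumptions made for illustration; the dataset and format actually used later in the notebook may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical task prefix and label strings (assumptions, not from the notebook).
TASK_PREFIX = "classify sentiment: "
LABEL_TEXT = {0: "negative", 1: "positive"}


def to_text_pair(review: str, label: int) -> Dict[str, str]:
    """Convert one (review, label) record into an input/target text pair."""
    # Replace HTML line breaks and collapse repeated whitespace.
    cleaned = " ".join(review.replace("<br />", " ").split())
    return {
        "input_text": TASK_PREFIX + cleaned,
        "target_text": LABEL_TEXT[label],
    }


# Made-up sample records standing in for a real movie review dataset.
raw_reviews: List[Tuple[str, int]] = [
    ("A moving, beautifully shot film.<br />Loved it.", 1),
    ("Two hours I will never get back.", 0),
]

examples = [to_text_pair(text, label) for text, label in raw_reviews]

for example in examples:
    print(example)
```

Pairs in this shape can then be tokenized, with the input text as the model input and the target text as the labels.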