<a href="https://colab.research.google.com/github/neelsoumya/intro_to_LMMs/blob/main/LLaMA_lora_int8.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# [xturing](https://github.com/stochasticai/xturing) - LLaMA efficient fine-tuning tutorial

This tutorial shows how easy it is to fine-tune a model with xturing. The notebook fine-tunes the LLaMA 7B model on a GPU with limited memory; it requires only 9 GB of VRAM.
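
As a rough back-of-the-envelope check on the memory figure above, the sketch below estimates how much memory the model weights alone occupy at different precisions. The parameter count (about 6.7 billion for LLaMA 7B) and the helper name `weight_memory_gib` are assumptions made for illustration. Actual peak usage also includes adapter weights, optimizer state, activations, and framework overhead, which this sketch ignores.

```python
# Rough estimate of weight memory only.
# Assumption: LLaMA 7B has roughly 6.7e9 parameters (not stated in this tutorial).
# Activations, adapter weights, optimizer state, and overhead are NOT included.

ASSUMED_PARAM_COUNT = 6.7e9

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("fp32", "fp16", "int8"):
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAM_COUNT, dtype):.1f} GiB")
```

Under these assumptions the INT8 weights come to a little over 6 GiB, which would leave some headroom within the quoted budget for everything else needed during training.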

## 1. Install the `xturing` library

In [None]:
!pip install xturing --upgrade

Collecting xturing
  Downloading xturing-0.1.8-py3-none-any.whl.metadata (25 kB)
Collecting pytorch-lightning (from xturing)
  Downloading pytorch_lightning-2.5.1-py3-none-any.whl.metadata (20 kB)
Collecting transformers==4.31.0 (from xturing)
  Downloading transformers-4.31.0-py3-none-any.whl.metadata (116 kB)
Collecting datasets==2.14.5 (from xturing)
  Downloading datasets-2.14.5-py3-none-any.whl.metadata (19 kB)
Collecting evaluate==0.4.0 (from xturing)
  Downloading evaluate-0.4.0-py3-none-any.whl.metadata (9.4 kB)
Collecting bitsandbytes==0.41.1 (from xturing)
  Downloading bitsandbytes-0.41.1-py3-none-any.whl.metadata (9.8 kB)
Collecting deepspeed==0.9.5 (from xturing)
  Downloading deepspeed-0.9.5.tar.gz (809 kB)
  Preparing

## 2. Download and unzip the dataset

In [None]:
!wget https://d33tr4pxdm6e2j.cloudfront.net/public_content/tutorials/datasets/alpaca_data.zip
!unzip alpaca_data.zip
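
To make the instruction-tuning data format more concrete, here is a small sketch of how Alpaca-style records can be arranged into columns. The column names `instruction`, `text`, and `target` are an assumption about what `InstructionDataset` expects, and the two sample records and the `to_columns` helper are hypothetical. Inspect the unzipped `alpaca_data` directory to confirm the actual schema.

```python
# Hypothetical sample records in the Alpaca style (NOT taken from alpaca_data.zip).
records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]


def to_columns(rows):
    """Convert row-oriented records into column-oriented lists.

    Assumption: the dataset uses the columns 'instruction', 'text', 'target'.
    """
    return {
        "instruction": [r["instruction"] for r in rows],
        "text": [r["input"] for r in rows],      # optional context for the instruction
        "target": [r["output"] for r in rows],   # the response the model should learn
    }


columns = to_columns(records)
# Every column must hold exactly one entry per example.
assert len({len(v) for v in columns.values()}) == 1
print(columns["instruction"][0], "->", columns["target"][0])
```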

In [None]:
!pip install accelerate

## 3. Load the dataset and initialize the model

In [None]:
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models.base import BaseModel

instruction_dataset = InstructionDataset("/content/alpaca_data")
# Initialize the model (LLaMA with LoRA adapters and INT8 quantization)
model = BaseModel.create("llama_lora_int8")
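
The `llama_lora_int8` key suggests that the base weights are loaded in 8-bit precision and trained through LoRA adapters. As a rough illustration of why LoRA keeps the number of trainable parameters small, the sketch below compares a full weight matrix with a low-rank update of the same shape. The hidden size of 4096 and the rank of 8 are assumed values chosen for illustration, not settings read from xturing.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in a LoRA update W + B @ A,
    where A is rank x d_in and B is d_out x rank."""
    return rank * d_in + d_out * rank


D = 4096  # assumed hidden size (illustrative)
R = 8     # assumed LoRA rank (illustrative)

full = full_params(D, D)
lora = lora_params(D, D, R)
print(f"full matrix: {full:,} weights")
print(f"LoRA update: {lora:,} trainable weights ({100 * lora / full:.2f}% of full)")
```

Only the small `A` and `B` matrices receive gradients in this scheme, while the original matrix stays frozen.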

## 4. Start fine-tuning

In [None]:
# Fine-tune the model on the instruction dataset
model.finetune(dataset=instruction_dataset)

## 5. Generate an output text with the fine-tuned model

In [None]:
# Once the model has been fine-tuned, you can run inference
output = model.generate(texts=["Why LLM models are becoming so important?"])
print("Generated output by the model: {}".format(output))

In [None]:
generation_config = model.generation_config()
print(generation_config)

In [None]:
# Restrict generation to the single most likely token at each step
generation_config.top_k = 1
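
Setting `top_k` to 1 restricts sampling to the single most likely token at each step, which in effect behaves like greedy decoding. The sketch below shows the general idea on a toy distribution. The vocabulary, the probabilities, and the `top_k_sample` helper are made up for illustration and are not xturing's implementation.

```python
import random


def top_k_sample(probs: dict, k: int, rng: random.Random) -> str:
    """Sample one token from the k most probable entries."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [t for t, _ in top]
    weights = [p for _, p in top]
    # random.choices treats the weights as relative, so no manual renormalization is needed.
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_probs = {"important": 0.5, "popular": 0.3, "large": 0.15, "blue": 0.05}
rng = random.Random(0)

# With k=1 only the most probable token survives, so the result is deterministic.
print([top_k_sample(toy_probs, 1, rng) for _ in range(3)])
# With k=3 the least probable token "blue" is filtered out and can never be drawn.
print(sorted({top_k_sample(toy_probs, 3, rng) for _ in range(200)}))
```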

In [None]:
# Generate again, this time with the updated generation config (top_k = 1)
output = model.generate(texts=["Why LLM models are becoming so important?"])
print("Generated output by the model: {}".format(output))

## Do you have any questions?

You can open an issue in our [GitHub repo](https://github.com/stochasticai/xturing).
