# LLM Finetune - Mistral 7B

This studio lets you fine-tune a Mistral 7B model and request inference through the API.

To start, make sure you **switch to an A10G machine**.

In [1]:
import torch
assert torch.cuda.is_available(), "Please switch to a GPU instance (1xA10G recommended)"



### Prepare the dataset

First off, download and prepare the dataset. The following defaults to the Alpaca dataset (`--dataset alpaca`), but you can fine-tune on the Dolly dataset (`--dataset dolly`), or bring your own CSV (`--dataset csv`).

In the latter case, provide a CSV file with the following three columns:
```
instruction input output
```
and pass it as the `--csv_path <data.csv>` argument to the script.
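
As a minimal sketch of that layout, the snippet below writes a small CSV with the three columns using Python's standard `csv` module. The file name `data.csv`, the sample rows, the helper name `write_instruction_csv`, and the comma delimiter are assumptions for illustration; check `prepare_dataset.py` for the exact format it parses.

```python
import csv

# Hypothetical example rows; replace them with your own data.
rows = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]


def write_instruction_csv(path, rows):
    """Write rows to a CSV file with the three columns named above."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["instruction", "input", "output"])
        writer.writeheader()    # first line: instruction,input,output
        writer.writerows(rows)  # one line per training example


write_instruction_csv("data.csv", rows)
```

The `input` column may be left empty for instructions that need no extra context, as in the second row.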

In [18]:

dataset_name = "alpaca_small"
lora_path = f"out/lora/{dataset_name}/lit_model_lora_finetuned.pth"

In [None]:

# IPython expands {dataset_name} inside `!` commands, so no f-string prefix is needed.
!python llm-finetune/prepare_dataset.py --dataset "{dataset_name}"

### Enable Flash Attention 2 (Optional)

Install the PyTorch 2.2 nightly build to enable Flash Attention 2 (FA2).

In [None]:
# use flash attention 2
# https://github.com/Lightning-AI/lit-gpt/issues/637
# !pip uninstall -y torch
# !pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121

### Fine-tune the base Mistral 7B model on the dataset

You can now fine-tune your model on the data. This script will automatically run across all available GPUs.

In [None]:
# finetune.py hardcodes the number of iterations to 50k. You can edit it to match the size of your dataset.
# IPython expands {dataset_name} inside `!` commands, so no f-string prefix is needed.
!python llm-finetune/finetune.py --dataset "{dataset_name}"
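
The comment above suggests matching the iteration count to your dataset. One rough way to pick a value is sketched below, assuming one iteration processes one micro-batch. The function name, the parameter names, and the example numbers are all hypothetical; how `finetune.py` actually counts iterations should be confirmed in the script itself.

```python
import math


def iterations_for(num_samples, epochs, micro_batch_size):
    """Micro-batch steps needed to pass over the data `epochs` times.

    Assumes one iteration consumes one micro-batch (an assumption, not
    something taken from finetune.py).
    """
    return math.ceil(num_samples * epochs / micro_batch_size)


# Hypothetical numbers: 1,000 examples, 3 epochs, 4 examples per micro-batch.
print(iterations_for(num_samples=1000, epochs=3, micro_batch_size=4))  # 750
```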

### Try generating text

You can now generate text using your fine-tuned model. Feel free to adjust the prompt as needed.

In [14]:
# The fine-tuning step above saved the adapter to 'out/lora/{dataset_name}/lit_model_lora_finetuned.pth'.

# However, generate_lora.py does not know about that output folder and defaults to the dolly path. Here is what that code looks like right now:

# def generate(
#     model_name: str = "mistralai/Mistral-7B-v0.1",
#     prompt: str = ""
# ):
#     lora.main(
#         prompt = prompt,
#         checkpoint_dir = Path("llm-finetune/lit-gpt/checkpoints") / model_name,
#         lora_path = Path("out/lora/dolly/lit_model_lora_finetuned.pth"),
#         precision = "bf16-true",
#         quantize = "bnb.nf4",
#     )

# Original way of calling it:
#!python llm-finetune/generate_lora.py  --prompt "Is pineapple pizza any good?"

# Fixed way to call it:
!python llm-finetune/generate_lora.py --lora_path "{lora_path}"  --prompt "Is pineapple pizza any good?"
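
Because the default path in `generate_lora.py` points at the dolly folder, it can help to confirm that the adapter file exists before generating. The sketch below rebuilds the path with the same pattern as `lora_path` defined earlier; the helper name `lora_checkpoint` and its `root` parameter are hypothetical.

```python
from pathlib import Path


def lora_checkpoint(dataset_name, root="out/lora"):
    """Build the adapter path with the same pattern as `lora_path`."""
    return Path(root) / dataset_name / "lit_model_lora_finetuned.pth"


path = lora_checkpoint("alpaca_small")
if not path.is_file():
    # Nothing to load yet: the fine-tuning step has to run first.
    print(f"No adapter found at {path}; run the fine-tuning step first.")
```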

### Produce a checkpoint

You can merge the fine-tuned LoRA layers into the base model to obtain a self-contained checkpoint.

In [None]:
!python llm-finetune/merge_lora.py --lora_path "{lora_path}"
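
To illustrate what merging means, the toy sketch below applies the standard LoRA merge rule, `W_merged = W + (alpha / r) * B @ A`, to a 2x2 weight matrix in plain Python. It shows the general technique only and makes no claim about how `merge_lora.py` is implemented; the function names and all values are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge rule."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values chosen only for illustration.
w = [[1.0, 0.0], [0.0, 1.0]]  # base weight, shape (2, 2)
lora_a = [[1.0, 2.0]]         # A, shape (r, in_features) = (1, 2)
lora_b = [[0.5], [1.0]]       # B, shape (out_features, r) = (2, 1)

merged = merge_lora(w, lora_a, lora_b, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is what makes the resulting checkpoint self-contained.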


In [None]:
# TODO: generate using this merged file
