# Fine-tune Mistral with LLaMA Factory using QLoRA
### LLaMA Factory supports more than 100 datasets and 50 LLMs, with LoRA, QLoRA, and full fine-tuning.
### Base model: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
### Dataset: https://huggingface.co/datasets/MattCoddity/dockerNLcommands
### YouTube: https://www.youtube.com/watch?v=iMD7ba1hHgw&list=PLrLEqwuz-mRIEtuUEN8sse2XyksKNN4Om&index=4&ab_channel=AIAnytime

## First clone the repository

In [None]:
%cd /content/
%rm -rf LLaMA-Factory
!git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git

In [None]:
%cd LLaMA-Factory

In [None]:
%ls

In [None]:
%pwd

## Install the packages required to run LLaMA-Factory

In [None]:
!pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1
!pip uninstall -y jax
!pip install -e .[torch,bitsandbytes,liger-kernel]

### We will load the model with 4-bit quantization and fine-tune it with QLoRA

In [None]:
# install bitsandbytes for the quantization
!pip install bitsandbytes
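
To get a feel for why 4-bit quantization matters on a small GPU, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The parameter count (about 7.24 billion) is an assumption used for illustration, and the estimate ignores activations, optimizer state, the LoRA adapter, and quantization overhead.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 7.24e9  # assumed approximate parameter count, for illustration only

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.2f} GB")
print(f"4-bit weights:  ~{int4_gb:.2f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the 16-bit footprint, which is what makes a 7B model practical on a single modest GPU.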

In [1]:
import json

# Path to dataset_info.json
file_path = "data/dataset_info.json"

# Load the existing JSON data
with open(file_path, "r") as file:
    data = json.load(file)

# Add the new dataset (this key is the name you will select in the web UI)
data["docker_dataset"] = {
    "hf_hub_url": "MattCoddity/dockerNLcommands",
    "columns": {
        "prompt": "instruction",
        "query": "input",
        "response": "output"
    }
}

# Save the updated JSON
with open(file_path, "w") as file:
    json.dump(data, file, indent=2)
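
Since the dataset is later selected by name in the web UI, it can help to confirm that the key written to the JSON file is exactly the one you expect. The sketch below is a hypothetical helper (`find_dataset_entry` is not part of LLaMA-Factory) that looks up a name and suggests close matches if it is missing.

```python
import difflib
import json


def find_dataset_entry(info_path: str, name: str) -> dict:
    """Return the registered entry for `name`, or raise KeyError with suggestions."""
    with open(info_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if name not in data:
        # Suggest similarly spelled keys to catch typos in the dataset name
        close = difflib.get_close_matches(name, list(data), n=3)
        raise KeyError(f"{name!r} is not registered; close matches: {close}")
    return data[name]
```

For example, `find_dataset_entry("data/dataset_info.json", "docker_dataset")` should return the entry with its `hf_hub_url` and `columns` if the registration above succeeded.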


## Run the user interface to setup the training parameters

In [None]:
%cd /content/LLaMA-Factory

#### Replace with your token

In [None]:
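# @title
from subprocess import Popen
import os

# Placeholder: paste your own Hugging Face token here and never share a real one
HF_token = "hf_xxx"
os.environ["HF_TOKEN"] = HF_token

# Copy the environment so the child process also sees the token and cache location
env = os.environ.copy()
env["HF_HOME"] = "/root/.huggingface"
env["HF_TOKEN"] = HF_token

# "your_args" is a placeholder; replace it with your own training arguments
Popen(["llamafactory-cli", "train", "your_args"], env=env)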
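
To avoid leaving a token in the notebook at all, one option is to read it from the environment and only prompt for it when it is missing. This is a minimal sketch; the helper name `get_hf_token` is hypothetical, and it assumes the token is stored in the `HF_TOKEN` environment variable used in the cell above.

```python
import os
from getpass import getpass


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the token from the environment, falling back to a hidden prompt."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        # getpass hides the input, so the token does not end up in the cell output
        token = getpass("Hugging Face token: ").strip()
    if not token:
        raise ValueError("No Hugging Face token provided")
    return token
```

You could then write `os.environ["HF_TOKEN"] = get_hf_token()` instead of assigning a literal string.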

In [None]:
!GRADIO_SHARE=1 llamafactory-cli webui
# !CUDA_VISIBLE_DEVICES=0 python src/webui.py

## Next:
- choose model as: Mistral-7B-Instruct-v0.1
- choose the dataset as: docker_dataset
- set Quantization bit to: 4 to enable QLoRA.
- set prompt template to mistral
- set learning rate to 2e-4
- set cutoff length to 512 (to reduce computational cost and time).
- reduce max samples to 10000 (to reduce computational cost and time).
- set epochs to 1 (to reduce computational cost and time).
- keep the compute type at bf16 if your GPU supports it (Ampere or newer, such as an A100); on older GPUs switch to fp16.
- set max gradient norm to 0.3.
- set batch size to 16
- LoRA configuration: the defaults are set by the source code, so we leave them unchanged. To experiment, you can increase the LoRA rank (intuition: the smaller the model, the higher the rank should be).
- click on preview command to see all parameters
- click start and monitor the losses (they appear a few minutes after the model has been downloaded).
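
The same choices can also be written down as a config file instead of being clicked together in the UI. The sketch below collects the settings from the list above into a dictionary and writes it as JSON. The key names are assumptions based on common LLaMA-Factory training arguments and may differ between versions, so treat the output of "preview command" as the authoritative reference; `output_dir` and both helper names are hypothetical.

```python
import json


def build_train_args(output_dir: str = "saves/mistral_docker_qlora") -> dict:
    """Collect the UI settings listed above into one dictionary (assumed key names)."""
    return dict(
        stage="sft",                      # supervised fine-tuning
        do_train=True,
        model_name_or_path="mistralai/Mistral-7B-Instruct-v0.1",
        dataset="docker_dataset",         # the name registered in dataset_info.json
        template="mistral",
        finetuning_type="lora",
        quantization_bit=4,               # 4-bit base model, i.e. QLoRA
        cutoff_len=512,
        max_samples=10000,
        learning_rate=2e-4,
        num_train_epochs=1.0,
        max_grad_norm=0.3,
        per_device_train_batch_size=16,
        bf16=True,                        # switch to fp16=True if bf16 is unsupported
        output_dir=output_dir,            # hypothetical output location
    )


def write_train_config(path: str, args: dict) -> None:
    """Write the arguments as a JSON config file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(args, f, indent=2)
```

Assuming the CLI accepts a JSON config for training the same way it does for `export` later in this notebook, you could call `write_train_config("train_mistral_docker.json", build_train_args())` and then run `!llamafactory-cli train train_mistral_docker.json`.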

## Merge with the base model and push to your Hugging Face Hub

In [None]:
!huggingface-cli login

In [None]:
import json

args = dict(
  model_name_or_path="mistralai/Mistral-7B-Instruct-v0.1",  # the official non-quantized base model
  adapter_name_or_path="/content/LLaMA-Factory/saves/Mistral-7B-v0.1/lora/train_2024-12-26-10-43-37",  # the saved LoRA adapter from training
  template="mistral",                      # same as the one used in training
  finetuning_type="lora",                  # same as the one used in training
  export_dir="llama3_lora_merged",         # directory where the merged model is saved
  export_size=1,                           # file shard size (in GB) of the merged model
  export_device="cuda",                    # device used for export: `cpu` or `cuda`
  export_hub_model_id="Hghanem96/Mistral_docker",  # Hugging Face Hub ID to upload the model to
)

# Write the export config; the context manager closes the file before the CLI reads it
with open("merge_mistral_docker.json", "w", encoding="utf-8") as f:
    json.dump(args, f, indent=2)

%cd /content/LLaMA-Factory/

!llamafactory-cli export merge_mistral_docker.json

## Extra exercise: Try adding a new prompt template if the model you fine-tune is not supported yet
- Go to src/llamafactory/data/template.py
- Templates are registered further down in the file (ordered alphabetically by name)
- Add your template or modify an existing one, for example modify the following:
```
_register_template(
    name="llama2_zh",
    format_user=StringFormatter(slots=[{"bos_token"}, "[INST] {{content}} [/INST]"]),
    format_system=StringFormatter(slots=["<<SYS>>\n{{content}}\n<</SYS>>\n\n"]),
    default_system="You are a helpful assistant. 你是一个乐于助人的助手。",
)
```
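
To see what such a template does to a prompt, here is a simplified stand-in that fills the `{{content}}` placeholder in a slot string. It is not LLaMA-Factory's `StringFormatter`; `render_slot` is a hypothetical helper, and special-token slots such as `bos_token` are ignored.

```python
def render_slot(slot: str, content: str) -> str:
    """Fill the {{content}} placeholder of a template slot (simplified stand-in)."""
    return slot.replace("{{content}}", content)


# Slot strings copied from the llama2_zh template above
user_slot = "[INST] {{content}} [/INST]"
system_slot = "<<SYS>>\n{{content}}\n<</SYS>>\n\n"

prompt = render_slot(user_slot, "List all running containers")
system = render_slot(system_slot, "You are a helpful assistant.")

print(prompt)
print(system)
```

Rendering a few sample prompts this way is a quick check that the markers in your new template match what the base model was trained on.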