<a href="https://colab.research.google.com/github/jyotidabass/Finetuning_Mistral_7b/blob/main/14.Finetuning_Mistral_7b_Using_AutoTrain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Fine-tuning Mistral 7b with AutoTrain

### Setup Runtime
Fine-tuning Mistral 7B requires a GPU instance. Follow the directions below:

- Go to `Runtime` (located in the top menu bar).
- Select `Change Runtime Type`.
- Choose `T4 GPU` (or a comparable option).
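
Once the runtime is switched, it can help to confirm that a GPU is actually attached before installing anything. The following is a minimal sketch that relies only on the `nvidia-smi` tool being on the path, which is an assumption about the runtime image; the helper name `gpu_summary` is made up for this example.

```python
import shutil
import subprocess

def gpu_summary():
    """Return the GPU name and memory reported by nvidia-smi, or None if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # the tool exists but could not talk to a GPU
    return result.stdout.strip() or None

summary = gpu_summary()
print(summary if summary else "No GPU detected: check Runtime > Change Runtime Type")
```

If the message about no GPU appears, repeat the runtime steps above before continuing.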

## Step 1: Setup Environment

In [1]:
!pip install pandas autotrain-advanced -q

In [2]:
!autotrain setup --update-torch

> INFO    Installing latest xformers
> INFO    Successfully installed latest xformers
> INFO    Installing latest PyTorch
> INFO    Successfully installed latest PyTorch


## Step 2: Connect to HuggingFace for Model Upload

### Logging in to Hugging Face
To upload the fine-tuned model so that it can be used for inference, you need to log in to the Hugging Face Hub.

### Getting a Hugging Face token
Steps:

1. Navigate to this URL: https://huggingface.co/settings/tokens
2. Create a write `token` and copy it to your clipboard
3. Run the code below and enter your `token`

In [3]:
from huggingface_hub import notebook_login
notebook_login()


## Step 3: Upload your dataset

Add your dataset to the root directory of the Colab session under the name `train.csv`. The AutoTrain command looks for your data there under that name.

#### Don't have a dataset and want to try fine-tuning on an example dataset?
If you don't have a dataset, you can run the commands below to download an example dataset and save it as `train.csv`.

In [4]:
!git clone https://github.com/joshbickett/finetune-llama-2.git
%cd finetune-llama-2
%mv train.csv ../train.csv
%cd ..

Cloning into 'finetune-llama-2'...
remote: Enumerating objects: 70, done.
remote: Counting objects: 100% (70/70), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 70 (delta 38), reused 48 (delta 19), pack-reused 0
Receiving objects: 100% (70/70), 25.13 KiB | 3.59 MiB/s, done.
Resolving deltas: 100% (38/38), done.
/content/finetune-llama-2
/content


In [5]:
import pandas as pd
df = pd.read_csv("train.csv")
df

| Unnamed: 0 | Concept | Funny Description Prompt | text |
|---|---|---|---|
| 0 | A cactus at a dance party | A cactus, wearing disco lights and surrounded ... | ###Human:\nGenerate a midjourney prompt for A ... |
| 1 | A robot on a first date | A robot, with a bouquet of USB cables, nervous... | ###Human:\nGenerate a midjourney prompt for A ... |
| 2 | A snail at a speed contest | A snail, with a mini rocket booster, confident... | ###Human:\nGenerate a midjourney prompt for A ... |
| 3 | A penguin at a beach party | A penguin, with sunscreen and a surfboard, try... | ###Human:\nGenerate a midjourney prompt for A ... |
| 4 | A cloud in a bad mood | A cloud, grumbling and dropping mini lightning... | ###Human:\nGenerate a midjourney prompt for A ... |
| ... | ... | ... | ... |
| 112 | A donut feeling the hole emptiness | A donut, in existential bakery therapy, ponder... | ###Human:\nGenerate a midjourney prompt for A ... |
| 113 | A pineapple with a prickly attitude | A pineapple, in a prickly personality class, s... | ###Human:\nGenerate a midjourney prompt for A ... |
| 114 | A calculator crunching life's problems | A calculator, at a problem-solving workshop, c... | ###Human:\nGenerate a midjourney prompt for A ... |
| 115 | A kite reaching new heights | A kite, in an altitude adjustment session, unt... | ###Human:\nGenerate a midjourney prompt for A ... |


In [6]:
df['text'][15]

'###Human:\nGenerate a midjourney prompt for A book on a mystery adventure\n\n###Assistant:\nA book, wearing detective glasses, flipping through its own pages, trying to solve the cliffhanger it was left on.'
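
The sample above shows the layout each `text` entry follows: a `###Human:` turn with the request, a blank line, then an `###Assistant:` turn with the answer. If you want to build your own `train.csv` in the same layout, a minimal sketch could look like the following. The helper names `format_example` and `write_train_csv` are made up for this example, and only the `text` column is written, on the assumption that it is the column used for training.

```python
import csv
import io

PROMPT_PREFIX = "Generate a midjourney prompt for "

def format_example(concept, description):
    """Combine a concept and its description into one training string."""
    return f"###Human:\n{PROMPT_PREFIX}{concept}\n\n###Assistant:\n{description}"

def write_train_csv(pairs, file_obj):
    """Write one row per (concept, description) pair under a single 'text' column."""
    writer = csv.DictWriter(file_obj, fieldnames=["text"])
    writer.writeheader()
    for concept, description in pairs:
        writer.writerow({"text": format_example(concept, description)})

pairs = [
    (
        "A book on a mystery adventure",
        "A book, wearing detective glasses, flipping through its own pages, "
        "trying to solve the cliffhanger it was left on.",
    ),
]

# An in-memory buffer keeps the sketch side-effect free.
buffer = io.StringIO()
write_train_csv(pairs, buffer)
print(buffer.getvalue())
```

To write a real file, pass `open("train.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer.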

## Step 4: Overview of AutoTrain command

#### Short overview of what the command flags do.

- `!autotrain`: The `!` prefix runs a shell command from a notebook cell; `autotrain` is the AutoTrain command-line utility.

- `llm`: Sub-command that selects the task type, here training a large language model.

- `--train`: Initiates the training process.

- `--project_name`: Sets the name of the project.

- `--model bn22/Mistral-7B-Instruct-v0.1-sharded`: Specifies the base model to fine-tune, here the Hugging Face model "Mistral-7B-Instruct-v0.1-sharded" hosted under the "bn22" account.

- `--data_path .`: The path to the dataset for training. The "." refers to the current directory. The `train.csv` file needs to be located in this directory.

- `--use_int4`: Uses INT4 quantization to reduce the model's memory footprint at the cost of some precision.

- `--learning_rate 2e-4`: Sets the learning rate for training to 0.0002.

- `--train_batch_size 12`: Sets the batch size for training to 12.

- `--num_train_epochs 3`: The training process will iterate over the dataset 3 times.
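
To get a feel for how `--train_batch_size` and `--num_train_epochs` interact, the number of optimizer steps can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: the example dataset is taken to have 116 rows (indices 0 to 115 in the preview), training runs on a single GPU, there is no gradient accumulation, and any sequence packing the trainer may apply is ignored. The function name `optimizer_steps` is made up for this example.

```python
import math

def optimizer_steps(num_rows, batch_size, epochs, grad_accum=1):
    """Estimate how many optimizer updates a training run performs."""
    batches_per_epoch = math.ceil(num_rows / batch_size)   # the last batch may be partial
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs

steps = optimizer_steps(num_rows=116, batch_size=12, epochs=3)
print(steps)  # 10 batches per epoch * 3 epochs = 30
```

With so few updates, the learning rate has a large influence on the result, which is one reason to experiment with it on small datasets.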

### Steps needed before running
Go to the `!autotrain` code cell below and update it as follows:

1. After `--project_name`, replace `*enter-a-project-name*` with the name you'd like to give the project.
2. After `--repo_id`, replace `*username*/*repository*`: use your Hugging Face username for `*username*` and the repository name you'd like the model uploaded to for `*repository*`. You don't need to create this repository beforehand; it is created automatically and the model is uploaded once training completes.
3. Confirm that `train.csv` is in the root directory of the Colab session. The `--data_path .` flag makes AutoTrain look for your data there.
4. Make sure to specify the LoRA target modules to be trained: `--target_modules q_proj,v_proj`
5. Once you've made these changes, you're all set. Run the command below!
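
Step 3 can also be checked programmatically before a long training run starts. The sketch below assumes the training text lives in a column named `text`, as in the example dataset; the helper name `check_training_file` is made up for this example.

```python
import csv
from pathlib import Path

def check_training_file(path="train.csv", text_column="text"):
    """Return a list of problems found with the training file (empty if none)."""
    file_path = Path(path)
    if not file_path.is_file():
        return [f"{path} not found"]
    # Read only the header row to see which columns exist.
    with file_path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    problems = []
    if text_column not in header:
        problems.append(f"column '{text_column}' missing from header {header}")
    return problems

print(check_training_file() or "train.csv looks ready")
```

An empty list means the file exists and has the expected column; anything else is worth fixing before launching the command.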

In [7]:
!autotrain llm --train --project_name mistral-7b-mj-finetuned --model bn22/Mistral-7B-Instruct-v0.1-sharded --data_path . --use_peft --use_int4 --learning_rate 2e-4 --train_batch_size 12 --num_train_epochs 3 --trainer sft --target_modules q_proj,v_proj --push_to_hub --repo_id ashishpatel26/mistral-7b-mj-finetuned

Traceback (most recent call last):
  File "/usr/local/bin/autotrain", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/autotrain/cli/autotrain.py", line 47, in main
    command = args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/autotrain/cli/run_llm.py", line 14, in run_llm_command_factory
    return RunAutoTrainLLMCommand(args)
  File "/usr/local/lib/python3.10/dist-packages/autotrain/cli/run_llm.py", line 445, in __init__
    raise ValueError("Token must be specified for push to hub")
ValueError: Token must be specified for push to hub
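
The traceback above indicates that this AutoTrain version expects a Hugging Face token on the command line whenever `--push_to_hub` is set; the earlier `notebook_login()` call was not sufficient on its own. A minimal sketch of assembling the same command with a token is shown below. The `--token` flag name is an assumption inferred from the error message and should be confirmed against `autotrain llm -h`; the `HF_TOKEN` environment variable, the placeholder repository name, and the helper `build_autotrain_command` are hypothetical.

```python
import os
import shlex

def build_autotrain_command(project_name, repo_id, token):
    """Return the AutoTrain call from this notebook as an argument list."""
    if not token:
        raise ValueError("A Hugging Face write token is required for --push_to_hub")
    return [
        "autotrain", "llm", "--train",
        "--project_name", project_name,
        "--model", "bn22/Mistral-7B-Instruct-v0.1-sharded",
        "--data_path", ".",
        "--use_peft", "--use_int4",
        "--learning_rate", "2e-4",
        "--train_batch_size", "12",
        "--num_train_epochs", "3",
        "--trainer", "sft",
        "--target_modules", "q_proj,v_proj",
        "--push_to_hub",
        "--repo_id", repo_id,
        "--token", token,  # assumed flag name, see the note above
    ]

# HF_TOKEN is a hypothetical variable name; a placeholder is used when it is unset.
token = os.environ.get("HF_TOKEN", "hf_placeholder")
cmd = build_autotrain_command(
    "mistral-7b-mj-finetuned",
    "your-username/mistral-7b-mj-finetuned",
    token,
)

# Print the command without revealing the token itself.
print(shlex.join(cmd[:-1] + ["<token hidden>"]))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids pasting the token into a notebook cell where it would be saved with the outputs.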


## Step 5: Completed 🎉
After the command above completes, your model is uploaded to Hugging Face.

#### Learn more about AutoTrain (optional)
To see which command-line flags are available, run the help command below.

In [8]:
!autotrain llm -h

usage: autotrain <command> [<args>] llm [-h] [--train] [--deploy] [--inference]
                                        [--data_path DATA_PATH] [--train_split TRAIN_SPLIT]
                                        [--valid_split VALID_SPLIT] [--text_column TEXT_COLUMN]
                                        [--rejected_text_column REJECTED_TEXT_COLUMN]
                                        [--prompt-text-column PROMPT_TEXT_COLUMN] [--model MODEL]
                                        [--model-ref MODEL_REF] [--learning_rate LEARNING_RATE]
                                        [--num_train_epochs NUM_TRAIN_EPOCHS]
                                        [--train_batch_size TRAIN_BATCH_SIZE]
                                        [--warmup_ratio WARMUP_RATIO]
                                        [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                                        [--optimizer OPTIMIZER] [--scheduler SCHEDULER]

## Step 6: Inference Engine

In [9]:
!pip install -q peft  accelerate bitsandbytes safetensors

In [10]:
import torch
import transformers
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# LoRA adapter repository on the Hugging Face Hub
adapters_name = "ashishpatel26/mistral-7b-mj-finetuned"
# Base model the adapter is applied to; alternative: "mistralai/Mistral-7B-Instruct-v0.1"
model_name = "bn22/Mistral-7B-Instruct-v0.1-sharded"

device = "cuda"  # the device to load the model onto

In [11]:
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
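
As a rough guide to why 4-bit loading matters on a 16 GB T4, the memory needed for the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch: it assumes roughly 7 billion parameters and ignores quantization constants, layers that stay unquantized, activations, and the KV cache. The function name `weight_memory_gib` is made up for this example.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30

NUM_PARAMS = 7e9  # assumption: "7B" taken as roughly 7 billion parameters

for label, bits in [("16-bit", 16), ("4-bit (nf4)", 4)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions the 16-bit weights alone come close to the T4's capacity, while the 4-bit weights leave room for activations and the adapter.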

In [12]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,  # 4-bit loading is already requested in bnb_config
    device_map='auto'
)


## Step 7: Load the PEFT Adapter from the Uploaded Model

In [13]:
model = PeftModel.from_pretrained(model, adapters_name)


In [14]:
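tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.bos_token_id = 1  # set the beginning-of-sequence token id explicitly

stop_token_ids = [0]  # defined here but not used by the generation cell below

print(f"Successfully loaded the model {model_name} into memory")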


Successfully loaded the model bn22/Mistral-7B-Instruct-v0.1-sharded into memory


In [16]:
text = "[INST] generate a midjourney prompt for four girls walks in the rain [/INST]"

# Tokenize the prompt and move the tensors to the same device as the model.
encoded = tokenizer(text, return_tensors="pt", add_special_tokens=False)
model_input = encoded.to(device)
# The model was already placed on the GPU by device_map='auto', so it is not moved again.
generated_ids = model.generate(**model_input, max_new_tokens=200, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


[INST] generate a midjourney prompt for four girls walks in the rain [/INST] Four girls walk in the rain, but as they reach the midpoint of their journey, they stumble upon a mysterious and enchanting forest. The rain has slowed to a gentle drizzle, and the forest is blanketed in lush greenery and sparkling dewdrops. The girls are filled with a sense of wonder and curiosity, and they decide to explore further into the forest. But as they walk deeper, the trees begin to twist and turn, and the forest becomes darker and more foreboding. What secrets does this mysterious forest hold, and what challenges will the girls face as they continue on their journey?</s>
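
The decoded string above still contains the original prompt and the end-of-sequence marker. If only the model's answer is needed, it can be cut out with plain string handling. The sketch below assumes the `[INST] ... [/INST]` prompt layout and the `</s>` marker seen in the output; the helper name `extract_response` is made up for this example.

```python
def extract_response(decoded_text, end_of_instruction="[/INST]", eos_token="</s>"):
    """Return the text after the last instruction marker, without the EOS marker."""
    # rpartition returns the whole text as the last element if the marker is absent.
    _, _, response = decoded_text.rpartition(end_of_instruction)
    return response.replace(eos_token, "").strip()

sample = "[INST] generate a midjourney prompt for a cat [/INST] A cat, in a tiny raincoat.</s>"
print(extract_response(sample))
```

Passing `skip_special_tokens=True` to `tokenizer.batch_decode` is another way to drop the `</s>` marker, although the prompt text would still need to be removed separately.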
