Now let's get to the fun part -- training a model. I'll start by installing the dependencies.

In [1]:
%pip install peft==0.5.0

!git clone https://github.com/OpenAccess-AI-Collective/axolotl
%pip install -e "./axolotl[flash-attn]"

[notice] A new release of pip is available: 23.1.2 -> 23.2.1
[notice] To update, run: python3.10 -m pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
fatal: destination path 'axolotl' already exists and is not an empty directory.
Obtaining file:///workspace/OpenPipe/examples/classify-recipes/axolotl
  Preparing metadata (setup.py) ... done
Collecting transformers@ git+https://github.com/huggingface/transformers.git (from axolotl==0.1)
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-install-o3o9dk76/transformers_99ed72a1465e41bba173c85b3be82a1b
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-o3o9dk76/transformers_99ed72a1465e41bba173c85b3be82a1b
  Resolved https://github.com/huggingface/transformers

I'll use the [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) library to manage our training run. It includes a lot of neat tricks that speed up training without sacrificing quality.

In this case I'm using 8-bit training to reduce GPU memory use, and sample packing to maximize GPU utilization. You can read more about the available options at https://github.com/OpenAccess-AI-Collective/axolotl.

The training run options are defined in [training-config.yaml](./training-config.yaml).
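
As a rough illustration of how the options mentioned above might appear once the config is parsed, here is a minimal sketch. `load_in_8bit`, `sample_packing`, and `adapter` are axolotl config keys as I understand them; the values in `example_config` are assumptions for illustration rather than the actual contents of `training-config.yaml`, and `summarize_training_options` is a hypothetical helper, not part of axolotl.

```python
# Hypothetical stand-in for the parsed training-config.yaml; the real file
# may differ. base_model matches the tokenizer name shown in the training log.
example_config = {
    "base_model": "meta-llama/Llama-2-7b-hf",
    "load_in_8bit": True,    # 8-bit training to reduce GPU memory use
    "sample_packing": True,  # pack short examples together into full sequences
    "adapter": "lora",       # train a LoRA adapter instead of all weights
}


def summarize_training_options(config: dict) -> dict:
    """Report which of the memory/throughput options a config enables."""
    return {
        "base_model": config.get("base_model"),
        "eight_bit": bool(config.get("load_in_8bit", False)),
        "sample_packing": bool(config.get("sample_packing", False)),
        "uses_lora": config.get("adapter") == "lora",
    }


print(summarize_training_options(example_config))
```

Running the same function on your own parsed config is a quick way to confirm that the options you intended are actually switched on before you start a long run.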

In [9]:
!accelerate launch ./axolotl/scripts/finetune.py training-config.yaml

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`

                           dP            dP   dP
                           88            88   88
.d8888b. dP.  .dP .d8888b. 88 .d8888b. d8888P 88
88'  `88  `8bd8'  88'  `88 88 88'  `88   88   88
88.  .88  .d88b.  88.  .88 88 88.  .88   88   88
`88888P8 dP'  `dP `88888P' dP `88888P'   dP   dP

[2023-08-24 04:29:56,887] [INFO] [axolotl.normalize_config:72] [PID:89149] GPU memory usage baseline: 0.000GB (+0.674GB misc)
[2023-08-24 04:29:56,887] [INFO] [axolotl.scripts.train:189] [PID:89149] loading tokenizer... meta-llama/Llama-2-7b-hf
[2023-08-24 04:29:57,058] [DEBUG] [axolotl.load_tokenizer:64] [PID:89149] EOS: 2 / </s>
[2023-08-24 04:29:57,058] [DEBUG] [axolotl.load_tokenizer:65] [PID:89149] BOS

Sweet! If you look at your filesystem, you should see a new directory, `./models/run1`. It contains your trained model, which you can use to classify more recipes.
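
If you'd rather check from the notebook than from a file browser, something like the following would do. This is a small sketch; `list_run_artifacts` is a hypothetical helper name, and it makes no assumption about which files axolotl writes into the directory.

```python
from pathlib import Path


def list_run_artifacts(run_dir: str) -> list[str]:
    """Return the sorted names of entries in run_dir, or [] if it is missing."""
    path = Path(run_dir)
    if not path.is_dir():
        return []
    return sorted(entry.name for entry in path.iterdir())


# "./models/run1" is the output directory named above.
for name in list_run_artifacts("./models/run1"):
    print(name)
```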

There's one more step though. I trained our model using [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora), which is a memory-efficient training method. But the inference library we'll use for testing doesn't support LoRA models directly yet, so we need to "merge" our LoRA model to transform it into a standard Llama2-shaped model. I've defined a small helper to do that called `merge_lora_model` that I'll use below.

In [2]:
from utils import merge_lora_model

print("Merging model (this could take a while)")
final_model_dir = merge_lora_model("training-config.yaml")
print(f"Final model saved to '{final_model_dir}'")


Loading base model


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading PEFT model
Running merge_and_unload
Model saved to ./models/recipe-model/merged
Final model saved to ./models/recipe-model/merged
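
The body of `merge_lora_model` isn't shown in this notebook, but the log lines above suggest its general shape. Here is a minimal sketch of what a merge like this might look like with `transformers` and `peft`. It is an approximation, not the actual helper: `merge_lora_sketch` and `merged_dir_for` are hypothetical names, the function takes the base model and LoRA directory directly instead of reading them from `training-config.yaml`, and loading in `float16` is my assumption.

```python
def merged_dir_for(lora_dir: str) -> str:
    """Place the merged model in a 'merged' subdirectory of the LoRA run."""
    return f"{lora_dir.rstrip('/')}/merged"


def merge_lora_sketch(base_model: str, lora_dir: str) -> str:
    """Fold LoRA adapter weights into the base model and save the result."""
    # Heavy imports stay local so the path helper works without torch installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    out_dir = merged_dir_for(lora_dir)

    print("Loading base model")
    # float16 is an assumption here; pick the dtype that suits your hardware.
    model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

    print("Loading PEFT model")
    model = PeftModel.from_pretrained(model, lora_dir)

    print("Running merge_and_unload")
    # Merges the adapter into the base weights and returns a plain model.
    model = model.merge_and_unload()

    # Save the tokenizer alongside the weights so the directory is self-contained.
    model.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_model).save_pretrained(out_dir)
    print(f"Model saved to {out_dir}")
    return out_dir
```

Under these assumptions, calling `merge_lora_sketch("meta-llama/Llama-2-7b-hf", "./models/recipe-model")` would write a standard model directory to `./models/recipe-model/merged`, which an inference library can load without knowing anything about LoRA.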


Ok, I have a model, but is it actually any good? I'll run some evaluations in [./evaluate.ipynb](./evaluate.ipynb) to check.