## Finetune the Model

## Environment Setup

This step uses the following libraries:
|Library|License|
|-|-|
| [PyTorch](https://github.com/pytorch/pytorch) | BSD 3-Clause |
| [python-dotenv](https://github.com/theskumar/python-dotenv) | BSD 3-Clause |
| [transformers](https://github.com/huggingface/transformers) | Apache 2.0 |
| [datasets](https://github.com/huggingface/datasets) | Apache 2.0 |
| [trl](https://github.com/huggingface/trl) | Apache 2.0 |
| [peft](https://github.com/huggingface/peft) | Apache 2.0 |
| [evaluate](https://github.com/huggingface/evaluate) | Apache 2.0 |
| [bert_score](https://github.com/Tiiiger/bert_score) | MIT |
| [numpy](https://numpy.org/about/) | Modified BSD |

In [1]:
import os
import json
from pathlib import Path
import numpy as np

import torch
from trl import DataCollatorForCompletionOnlyLM
from datasets import load_dataset
from transformers import (
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    AutoTokenizer,
    AutoModelForCausalLM,
    EarlyStoppingCallback)
from peft import LoraConfig, get_peft_model
import evaluate

In [2]:
DOCUMENT    = "FM5_0"
PDF_PATH    = Path("pdfs/raw/fm5-0.pdf")
BASE_MODEL  = Path("QuantFactory/Llama-3.2-1B-GGUF")
GGUF_FILE   = "Llama-3.2-1B.Q8_0.gguf"
CACHE_DIR   = "hf_cache"
DATA_DIR    = DOCUMENT / BASE_MODEL / "data"
MODEL_DIR   = DOCUMENT / BASE_MODEL / "lora"
CHUNKED_DATA = DATA_DIR / "chunked" / "chunked.jsonl"
QA_DATA      = DATA_DIR / "qa"       / "qa_pairs.jsonl"

os.environ["TOKENIZERS_PARALLELISM"] = "true"

Get the tokenizer ready.

In [3]:
tok              = AutoTokenizer.from_pretrained(MODEL_DIR)
# tok.pad_token    = "<|finetune_right_pad_id|>"
# tok.pad_token_id = tok.convert_tokens_to_ids(tok.pad_token)

Set a few constants for the data split, label masking, and sequence length.

In [4]:
TEST_PORTION = 0.1
IGNORE_ID    = -100
MAX_LEN      = 1024

And set up the prompt with prompt builders.

In [5]:
sys_prompt = " You are an FM-5-0 assistant. Concisely answer the following question."

In [6]:
sys_role = "system"
usr_role = "user"
bot_role = "assistant"

These are already in the tokenizer but being able to reference them will come in handy.

In [7]:
bos_tok      = "<|begin_of_text|>"
eot_id_tok   = "<|eot_id|>"
start_hd_tok = "<|start_header_id|>"
end_hd_tok   = "<|end_header_id|>"
eot_tok      = "<|end_of_text|>"

Define some functions to process the data so we can train on it.

In [8]:
def build_prompt(sys, context, usr, ans=None):
    prompt  = f"{bos_tok}"
    prompt += f"{start_hd_tok}{sys_role}{end_hd_tok}{context}{sys}{eot_id_tok}"
    prompt += f"{start_hd_tok}{usr_role}{end_hd_tok}{usr}{eot_id_tok}"
    prompt += f"{start_hd_tok}{bot_role}{end_hd_tok}"

    if ans is not None:
        prompt += f"{ans}{eot_id_tok}{eot_tok}"

    return prompt

In [9]:
def row_to_prompt(row):
    return {"text": build_prompt(sys_prompt, row['passage'], row['question'], ans=row['answer'])}
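
As a quick illustration of the layout `build_prompt` produces, here is a minimal standalone sketch. It re-declares the same special tokens and uses a made-up row (the `PASSAGE`, `QUESTION?`, and `ANSWER.` strings and the `header` helper are placeholders of mine, not part of the dataset or the notebook). The point is that the training prompt is just the inference prompt plus the answer and the closing tokens.

```python
# Special tokens copied from the cells above so this snippet runs on its own.
BOS, EOT_ID, EOT = "<|begin_of_text|>", "<|eot_id|>", "<|end_of_text|>"
START_HD, END_HD = "<|start_header_id|>", "<|end_header_id|>"


def header(role: str) -> str:
    """Wrap a role name in the header tokens (hypothetical helper)."""
    return f"{START_HD}{role}{END_HD}"


def build_prompt(sys_msg, context, usr, ans=None):
    """Same layout as the notebook: context + system text, user turn, assistant turn."""
    prompt = BOS
    prompt += f"{header('system')}{context}{sys_msg}{EOT_ID}"
    prompt += f"{header('user')}{usr}{EOT_ID}"
    prompt += header("assistant")
    if ans is not None:
        prompt += f"{ans}{EOT_ID}{EOT}"
    return prompt


# Placeholder row, purely illustrative.
row = {"passage": "PASSAGE", "question": "QUESTION?", "answer": "ANSWER."}
sys_msg = " You are an FM-5-0 assistant. Concisely answer the following question."

inference_prompt = build_prompt(sys_msg, row["passage"], row["question"])
training_prompt = build_prompt(sys_msg, row["passage"], row["question"], ans=row["answer"])

# Everything the training prompt adds on top of the inference prompt.
print(training_prompt[len(inference_prompt):])
```

Because both variants share the same prefix up to the assistant header, that header can serve as the boundary between what is masked and what is learned.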

Now process the data. I'll use one sample to see how it's handled through the collator and evaluations.

In [10]:
# splits  = raw_ds.train_test_split(TEST_PORTION, seed=42)
raw_ds = load_dataset("json", data_files=QA_DATA.as_posix(), split="train")
print(raw_ds[100])

{'passage': "team members identify that a certain population group has a history of not participating in the election process. While knowing that a group does not participate is useful, the planning team understands and explains why the group does not participate. As the planning team maps out the various problems and related causes, it sees that some of the issues are symptoms of a bigger issue. In addition, the team discerns that some problems are outside the scope of their mission. Mapping helps isolate the root cause of problems that the operational approach must address. Figure 4-4 on page 68 is an example of relationship mapping that focuses on the military problems that could be used to further describe a problem frame.\n\n## Figure 4-4. Refined operational frame based on strategic frame\n\n4-61. The goal of problem framing is to identify obstacles impeding progress toward achieving the desired end state. Effective commanders and planning teams recognize that few problems are so

In [11]:
prompt_ds = raw_ds.map(row_to_prompt,
                       remove_columns=raw_ds.column_names
                       )
print(prompt_ds[100])

{'text': "<|begin_of_text|><|start_header_id|>system<|end_header_id|>team members identify that a certain population group has a history of not participating in the election process. While knowing that a group does not participate is useful, the planning team understands and explains why the group does not participate. As the planning team maps out the various problems and related causes, it sees that some of the issues are symptoms of a bigger issue. In addition, the team discerns that some problems are outside the scope of their mission. Mapping helps isolate the root cause of problems that the operational approach must address. Figure 4-4 on page 68 is an example of relationship mapping that focuses on the military problems that could be used to further describe a problem frame.\n\n## Figure 4-4. Refined operational frame based on strategic frame\n\n4-61. The goal of problem framing is to identify obstacles impeding progress toward achieving the desired end state. Effective commande

In [12]:
tok_ds    = prompt_ds.map(tok, batched=True, input_columns=["text"], remove_columns=prompt_ds.column_names)
print(tok_ds[100])

Map:   0%|          | 0/462 [00:00<?, ? examples/s]

{'input_ids': [128000, 128000, 128006, 9125, 128007, 9376, 3697, 10765, 430, 264, 3738, 7187, 1912, 706, 264, 3925, 315, 539, 24435, 304, 279, 6355, 1920, 13, 6104, 14392, 430, 264, 1912, 1587, 539, 16136, 374, 5505, 11, 279, 9293, 2128, 31869, 323, 15100, 3249, 279, 1912, 1587, 539, 16136, 13, 1666, 279, 9293, 2128, 14370, 704, 279, 5370, 5435, 323, 5552, 11384, 11, 433, 16008, 430, 1063, 315, 279, 4819, 527, 13803, 315, 264, 11493, 4360, 13, 763, 5369, 11, 279, 2128, 42645, 82, 430, 1063, 5435, 527, 4994, 279, 7036, 315, 872, 9131, 13, 39546, 8779, 43223, 279, 3789, 5353, 315, 5435, 430, 279, 25605, 5603, 2011, 2686, 13, 19575, 220, 19, 12, 19, 389, 2199, 220, 2614, 374, 459, 3187, 315, 5133, 13021, 430, 24400, 389, 279, 6411, 5435, 430, 1436, 387, 1511, 311, 4726, 7664, 264, 3575, 4124, 13, 198, 198, 567, 19575, 220, 19, 12, 19, 13, 432, 4094, 25605, 4124, 3196, 389, 19092, 4124, 198, 198, 19, 12, 5547, 13, 578, 5915, 315, 3575, 59049, 374, 311, 10765, 32116, 3242, 16490, 5208, 9017

In [13]:
tok_sample = tok_ds[100]
print("IDS IN   :", tok_sample["input_ids"][:40])
print("MASK     :", tok_sample["attention_mask"][:40])
print("TOKENS IN:", tok.convert_ids_to_tokens(tok_sample["input_ids"][:40]))
print("TOKENS IN:", tok.decode(tok_sample["input_ids"][:40], clean_up_tokenization_spaces=True))

IDS IN   : [128000, 128000, 128006, 9125, 128007, 9376, 3697, 10765, 430, 264, 3738, 7187, 1912, 706, 264, 3925, 315, 539, 24435, 304, 279, 6355, 1920, 13, 6104, 14392, 430, 264, 1912, 1587, 539, 16136, 374, 5505, 11, 279, 9293, 2128, 31869, 323]
MASK     : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
TOKENS IN: ['<|begin_of_text|>', '<|begin_of_text|>', '<|start_header_id|>', 'system', '<|end_header_id|>', 'team', 'Ġmembers', 'Ġidentify', 'Ġthat', 'Ġa', 'Ġcertain', 'Ġpopulation', 'Ġgroup', 'Ġhas', 'Ġa', 'Ġhistory', 'Ġof', 'Ġnot', 'Ġparticipating', 'Ġin', 'Ġthe', 'Ġelection', 'Ġprocess', '.', 'ĠWhile', 'Ġknowing', 'Ġthat', 'Ġa', 'Ġgroup', 'Ġdoes', 'Ġnot', 'Ġparticipate', 'Ġis', 'Ġuseful', ',', 'Ġthe', 'Ġplanning', 'Ġteam', 'Ġunderstands', 'Ġand']
TOKENS IN: <|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>team members identify that a certain population group has a history of not parti

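So far so good: the desired prompt is tokenized and it decodes back properly. Note that `<|begin_of_text|>` appears twice, because the tokenizer prepends its own BOS token on top of the one `build_prompt` adds. Now I'll check the collator. I'm looking for the entire prompt to be ignored, up to the actual assistant response.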

In [14]:
collator = DataCollatorForCompletionOnlyLM(
    tokenizer            = tok,
    instruction_template = f"{start_hd_tok}{usr_role}{end_hd_tok}",
    response_template    = f"{start_hd_tok}{bot_role}{end_hd_tok}",
)

In [17]:
tok.pad_token = tok.eos_token

In [18]:
batch = collator([tok_sample])
for k,v in batch.items():
    print(k, v.shape, v[0][:40])

input_ids torch.Size([1, 570]) tensor([128000, 128000, 128006,   9125, 128007,   9376,   3697,  10765,    430,
           264,   3738,   7187,   1912,    706,    264,   3925,    315,    539,
         24435,    304,    279,   6355,   1920,     13,   6104,  14392,    430,
           264,   1912,   1587,    539,  16136,    374,   5505,     11,    279,
          9293,   2128,  31869,    323])
attention_mask torch.Size([1, 570]) tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
labels torch.Size([1, 570]) tensor([-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100,
        -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100,
        -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100,
        -100, -100, -100, -100])


The first part looks good: all of the injected context is ignored.

In [20]:
for k,v in batch.items():
    print(k, v.shape, v[0][-40:])

input_ids torch.Size([1, 570]) tensor([  4320,    279,   2768,   3488,     13, 128009, 128006,    882, 128007,
          3923,    374,    279,   5915,    315,   3575,  59049,    304,   6411,
          7677,     30, 128009, 128006,  78191, 128007,   1271,  10765,  32116,
           430,   3242,  15686,   5208,   9017,  32145,    279,  12974,    842,
          1614,     13, 128009, 128001])
attention_mask torch.Size([1, 570]) tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
labels torch.Size([1, 570]) tensor([  -100,   -100,   -100,   -100,   -100,   -100,   -100,   -100,   -100,
          -100,   -100,   -100,   -100,   -100,   -100,   -100,   -100,   -100,
          -100,   -100,   -100,   -100,   -100,   -100,   1271,  10765,  32116,
           430,   3242,  15686,   5208,   9017,  32145,    279,  12974,    842,
          1614,     13, 128009, 128001])


It also looks like the very end of this sample contains the actual tokens (non -100 values). I'll detokenize those to make sure the entire answer is included.

In [23]:
labels = batch["labels"][0].tolist()
last_mask_index = len(labels) - 1 - labels[::-1].index(IGNORE_ID)
masked_label = tok.decode(labels[last_mask_index + 1:], skip_special_tokens=True)
print(f"The collator only left this unmasked: {masked_label}")
print(f"Is only the answer unmasked?: {masked_label == raw_ds[100]['answer']}")

The collator only left this unmasked: To identify obstacles that impede progress toward achieving the desired end state.
Is only the answer unmasked?: True
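
The masking checked above can be sketched in plain Python. This is a simplified, single-turn illustration of the idea, not TRL's actual implementation. `find_sublist` and `mask_prompt` are hypothetical helper names, and the token list is an abbreviated version of the tail of the sample.

```python
IGNORE_ID = -100


def find_sublist(seq, pattern):
    """Return the start index of the last occurrence of pattern in seq, or -1."""
    for start in range(len(seq) - len(pattern), -1, -1):
        if seq[start:start + len(pattern)] == pattern:
            return start
    return -1


def mask_prompt(input_ids, response_ids):
    """Copy input_ids to labels, replacing everything up to and including
    the response template with IGNORE_ID."""
    start = find_sublist(input_ids, response_ids)
    if start == -1:
        # No response template found: nothing to learn from this sample.
        return [IGNORE_ID] * len(input_ids)
    answer_start = start + len(response_ids)
    return [IGNORE_ID] * answer_start + input_ids[answer_start:]


# Abbreviated tail of the sample: user turn, assistant header, answer, end tokens.
input_ids = [128006, 882, 128007, 3923, 374, 128009,
             128006, 78191, 128007, 1271, 10765, 13, 128009, 128001]
response_ids = [128006, 78191, 128007]  # <|start_header_id|>assistant<|end_header_id|>

labels = mask_prompt(input_ids, response_ids)
print(labels)
```

Only the positions after the assistant header keep their token IDs, so the loss is computed on the answer and the closing tokens alone.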


With data processing nailed down, I can split the data into training and test sets and prepare for training. For a more robust training cycle that makes use of all the data, I would tune the hyperparameters with 10-fold cross-validation, holding out 2 folds as test data, and then train on all the data once I'm happy with the hyperparameters. Here, though, I'm just going to use typical, reasonable values for the hyperparameters.

In [24]:
splits     = tok_ds.train_test_split(TEST_PORTION, seed=42)
tok_train  = splits["train"]
tok_test   = splits["test"]
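
The 10-fold scheme mentioned above could look something like the sketch below. It is one possible reading of "2 folds set to testing data" (each round holds out two consecutive folds), not something this notebook actually runs. `kfold_splits` is a hypothetical helper, and 462 is taken as the number of QA pairs in this run.

```python
import random


def kfold_splits(n_samples, k=10, test_folds=2, seed=42):
    """Yield (train_idx, test_idx) pairs; each round holds out `test_folds`
    consecutive folds (wrapping around) and trains on the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[j::k] for j in range(k)]
    for i in range(k):
        held_out = {(i + offset) % k for offset in range(test_folds)}
        test_idx = [x for j in held_out for x in folds[j]]
        train_idx = [x for j in range(k) if j not in held_out for x in folds[j]]
        yield train_idx, test_idx


# Print the train/test sizes for each of the 10 rounds.
for round_no, (train_idx, test_idx) in enumerate(kfold_splits(462)):
    print(round_no, len(train_idx), len(test_idx))
```

With this rotation every example lands in the held-out set exactly twice, so each hyperparameter setting is scored on all of the data.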

Now we load the model and the LoRA adapter.

Ideally this would be dead simple with SFTTrainer, but it doesn't support custom metrics yet (https://github.com/huggingface/trl/issues/862), so we have to do everything manually. I'm using gradient checkpointing only because I ran out of memory while training on my personal GPU.

In [25]:
base_model = AutoModelForCausalLM.from_pretrained(
            BASE_MODEL,
            cache_dir=CACHE_DIR,
            gguf_file=GGUF_FILE,
            device_map="auto",
            torch_dtype=torch.bfloat16)
base_model.gradient_checkpointing_enable()

Converting and de-quantizing GGUF tensors...:   0%|          | 0/147 [00:00<?, ?it/s]

In [26]:
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM")
lora_model = get_peft_model(base_model, lora_cfg)
lora_model.print_trainable_parameters()  # sanity check

trainable params: 851,968 || all params: 1,236,666,368 || trainable%: 0.0689
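
The trainable-parameter count can be sanity-checked by hand. The sketch below assumes the usual Llama-3.2-1B dimensions (hidden size 2048, 16 layers, 8 key/value heads with a head dimension of 64). Those numbers are my assumption and are not stated anywhere in this notebook. `lora_params` is a hypothetical helper.

```python
# Assumed Llama-3.2-1B dimensions (not stated in this notebook).
HIDDEN_SIZE = 2048
NUM_LAYERS = 16
NUM_KV_HEADS = 8
HEAD_DIM = 64

R = 8  # LoRA rank from lora_cfg


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """A LoRA adapter adds A (r x in) and B (out x r) next to a frozen linear layer."""
    return r * in_features + out_features * r


q_proj = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, R)              # 2048 -> 2048
v_proj = lora_params(HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM, R)  # 2048 -> 512
trainable = NUM_LAYERS * (q_proj + v_proj)

total = 1_236_666_368  # "all params" as printed above
print(f"trainable params: {trainable:,} || trainable%: {100 * trainable / total:.4f}")
```

Under those assumptions the arithmetic reproduces the 851,968 trainable parameters reported by `print_trainable_parameters()`.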


Before training, I'll set up some metrics for evaluation.

 - F1: This is the span-wise F1 from SQuAD. It shows how well the prediction and the ground truth match if we treat them as a "bag of tokens".
 - Perplexity: I like to look at this over loss because you can interpret it as how "confident" the model is for the next token. E.g. a perplexity of ~2 means the model is considering bet
 - BERT Score: This is a good one for understanding how close the meaning of the output is to the label. It compares the BERT embeddings of the prediction and the label, and the embeddings of similar words are more closely aligned than those of disparate words.

There are some others I would like to use to gain as much insight as possible, but I omitted them here for simplicity.
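
To make the "bag of tokens" F1 concrete, here is a simplified sketch. It only lowercases and splits on whitespace, whereas the official SQuAD script also strips punctuation and articles. It returns a value between 0 and 1, while the `squad` metric from `evaluate` reports F1 on a 0 to 100 scale. `token_f1` is a hypothetical helper and the prediction string is made up for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-tokens F1 on lowercased, whitespace-split text."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: each token counts as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


reference = "To identify obstacles that impede progress toward achieving the desired end state."
prediction = "Identify obstacles that block progress toward the desired end state."  # made up
print(round(token_f1(prediction, reference), 3))
```

Word order is ignored entirely, which is why this score is worth pairing with a semantic metric such as BERT Score.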

In [27]:
bert_metric   = evaluate.load("bertscore", cache_dir=CACHE_DIR)
squad_metric  = evaluate.load("squad", cache_dir=CACHE_DIR)

In [28]:
def compute_metrics(eval_preds) -> dict:
    preds  = eval_preds.predictions
    labels = eval_preds.label_ids
    losses = eval_preds.losses

    cleaned_labels = np.where(labels != IGNORE_ID, labels, tok.pad_token_id)
    cleaned_preds  = np.where(preds  != IGNORE_ID, preds,  tok.pad_token_id)

    decoded_preds  = tok.batch_decode(cleaned_preds.tolist(), skip_special_tokens=True)
    decoded_labels = tok.batch_decode(cleaned_labels.tolist(), skip_special_tokens=True)

    squad_preds = [
        {"id": str(i), "prediction_text": p}
        for i, p in enumerate(decoded_preds)
    ]
    squad_refs = [
        {
            "id": str(i),
            "answers": {"text": [decoded_labels[i]], "answer_start": [0]}
        }
        for i in range(len(decoded_labels))
    ]
    squad_results = squad_metric.compute(
        predictions=squad_preds,
        references=squad_refs
    )

    bert_results = bert_metric.compute(
        predictions=decoded_preds,
        references=decoded_labels,
        lang="en"
    )

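    return {
        # Mean of exp(loss) over the gathered eval losses; by Jensen's inequality this
        # comes out slightly higher than the conventional exp(mean loss).
        "perplexity":      np.mean(np.exp(losses)),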
        "bert_precision":  np.mean(bert_results["precision"]),
        "bert_recall":     np.mean(bert_results["recall"]),
        "bert_f1":         np.mean(bert_results["f1"]),
        "qa_f1":           squad_results["f1"],
        "exact_match":     squad_results["exact_match"],
    }

All that's left is to set up the training loop and train the model.

In [29]:
args = Seq2SeqTrainingArguments(
    output_dir                  = MODEL_DIR,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 32,
    num_train_epochs            = 10,
    learning_rate               = 2e-4,
    logging_steps               = 1,
    save_steps                  = 1,
    save_total_limit            = 10,
    neftune_noise_alpha         = 0.1,
    bf16                        = True,
    bf16_full_eval              = True,
    save_strategy               = "epoch",
    eval_strategy               = "epoch",
    report_to                   = "none",
    label_names                 = ["labels"],
    metric_for_best_model       = "eval_loss",
    load_best_model_at_end      = True,
    eval_on_start               = True,
    eval_accumulation_steps     = 10,
    include_for_metrics         = ["loss"],
    predict_with_generate       = True,
)

early_stopping = EarlyStoppingCallback(
    early_stopping_patience  = 1,
    early_stopping_threshold = 0.001,
)

trainer = Seq2SeqTrainer(
    model           = lora_model,
    args            = args,
    train_dataset   = tok_train,
    eval_dataset    = tok_test,
    data_collator   = collator,
    callbacks       = [early_stopping],
    compute_metrics = compute_metrics,
)

In [30]:
trainer.train()

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


| Epoch | Training Loss | Validation Loss | Perplexity | Bert Precision | Bert Recall | Bert F1 | Qa F1 | Exact Match |
|-|-|-|-|-|-|-|-|-|
| 0 | No log | 2.700835 | 15.037663 | 0.760618 | 0.891062 | 0.820571 | 9.709378 | 0.0 |
| 1 | 2.273800 | 2.07331 | 8.015541 | 0.760613 | 0.89113 | 0.820596 | 9.748886 | 0.0 |
| 2 | 1.583600 | 1.653955 | 5.258988 | 0.760613 | 0.890964 | 0.820526 | 9.841502 | 0.0 |
| 3 | 1.554800 | 1.463916 | 4.347284 | 0.760631 | 0.891087 | 0.820589 | 9.956081 | 0.0 |
| 4 | 1.236500 | 1.371908 | 3.969516 | 0.760611 | 0.891167 | 0.820611 | 9.983492 | 0.0 |
| 5 | 1.238300 | 1.303514 | 3.709529 | 0.760611 | 0.891167 | 0.820611 | 9.979784 | 0.0 |
| 6 | 1.138200 | 1.266672 | 3.577308 | 0.760629 | 0.891082 | 0.820585 | 9.98741 | 0.0 |
| 7 | 1.006200 | 1.244272 | 3.498294 | 0.760629 | 0.891082 | 0.820585 | 9.972797 | 0.0 |
| 8 | 1.225500 | 1.231085 | 3.451592 | 0.760629 | 0.891082 | 0.820585 | 9.968467 | 0.0 |



TrainOutput(global_step=60, training_loss=1.4516207198301951, metrics={'train_runtime': 1036.015, 'train_samples_per_second': 4.006, 'train_steps_per_second': 0.058, 'total_flos': 1.218783667064832e+16, 'train_loss': 1.4516207198301951, 'epoch': 8.615384615384615})
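
The Perplexity column above is slightly higher than the exponential of the Validation Loss column (for epoch 0, exp(2.700835) is about 14.89, against the 15.04 reported). A likely explanation, assuming `eval_preds.losses` holds one loss per evaluation batch, is that `compute_metrics` averages the exponentiated losses instead of exponentiating the average. The sketch below shows the difference using hypothetical batch losses.

```python
import math

# Per-batch evaluation losses; hypothetical values chosen for illustration.
batch_losses = [2.1, 2.6, 3.4]

mean_loss = sum(batch_losses) / len(batch_losses)

# Conventional perplexity: exponentiate the average loss.
ppl_exp_of_mean = math.exp(mean_loss)

# What compute_metrics above reports: average the exponentiated losses.
ppl_mean_of_exp = sum(math.exp(x) for x in batch_losses) / len(batch_losses)

# Jensen's inequality guarantees the second value is never smaller.
print(f"exp(mean loss) = {ppl_exp_of_mean:.3f}")
print(f"mean(exp loss) = {ppl_mean_of_exp:.3f}")

# Epoch-0 validation loss from the table, for comparison with its Perplexity column.
print(f"exp(2.700835)  = {math.exp(2.700835):.3f}")
```

The gap grows with the spread of the per-batch losses, so the two definitions agree only when every batch has the same loss.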

Now I'll compare the base model to the LoRA model. Because `get_peft_model` modifies the base model in place (even though I assigned the result to a new variable), we need to load the base model again.

In [79]:
base_model_2 = AutoModelForCausalLM.from_pretrained(
            BASE_MODEL,
            cache_dir=CACHE_DIR,
            gguf_file=GGUF_FILE,
            device_map="auto",
            torch_dtype=torch.bfloat16)

Converting and de-quantizing GGUF tensors...:   0%|          | 0/147 [00:00<?, ?it/s]

Some parameters are on the meta device because they were offloaded to the cpu.


In [66]:
qa_data = []
with open(QA_DATA, "r", encoding="utf-8") as f:
    for line in f:
        qa_data.append(json.loads(line))

Next is a quick test to see whether the model internalized any of the data or just learned how to copy from the context. I'll do this by setting the context to an empty string and asking a question that I know was in the training data.

In [68]:
question = qa_data[7]["question"]
answer   = qa_data[7]["answer"]
print(f"{question}\n{answer}")

What principles underpin effective planning in mission command?
Competence, shared understanding, mutual trust, mission orders, commander's intent, disciplined initiative, and risk acceptance.


In [80]:
context = ""
prompt  = build_prompt(sys_prompt, context, question)

inputs = tok(prompt, return_tensors="pt").to(base_model_2.device)
out = base_model_2.generate(**inputs,
                           max_new_tokens=256,
                           do_sample=True,
                           temperature=0.7,
                           top_p=0.9,
                           repetition_penalty=1.1,
                           no_repeat_ngram_size=4,
                           eos_token_id=tok.eos_token_id,
                           pad_token_id=tok.eos_token_id)
print(tok.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))

nerRadiuser دهید IndexPath freopen書館edReaderedReader)readeredReaderedReaderedReaderedReaderreaderedReaderedReaderedReader(readeredReaderedReaderedReader readeredReaderedReaderedReader.readeredReaderedReaderedReader	readeredReaderedReaderedReader)reader(readeredReaderedReader)reader)reader(readeredReader)reader(reader)reader(reader)reader)reader(reader)readeredReader(reader)reader(reader(reader)reader(reader reader(reader)reader(reader.reader(reader(reader(reader(reader.reader(reader.reader(reader)reader(reader	reader(reader(reader(reader)reader)reader)reader(reader(reader(reader-reader(reader(reader(reader	reader(reader)reader(reader(read(reader(reader(reader(


In [81]:
context = ""
prompt  = build_prompt(sys_prompt, context, question)

inputs = tok(prompt, return_tensors="pt").to(lora_model.device)
out = lora_model.generate(**inputs,
                           max_new_tokens=256,
                           do_sample=True,
                           temperature=0.7,
                           top_p=0.9,
                           repetition_penalty=1.1,
                           no_repeat_ngram_size=4,
                           eos_token_id=tok.eos_token_id,
                           pad_token_id = tok.eos_token_id)
print(tok.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))

Planning involves identifying key objectives, assessing risks and constraints, and creating a comprehensive plan to achieve goals.　ヾ


In [82]:
base_out = base_model_2.generate(**inputs,
                               max_new_tokens=256,
                               do_sample=True,
                               temperature=0.7,
                               top_p=0.9,
                               repetition_penalty=1.1,
                               no_repeat_ngram_size=4,
                               eos_token_id=tok.eos_token_id,
                               pad_token_id = tok.eos_token_id)
print(tok.decode(base_out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))

you are you prepared to be a part of this course of?
 JpaRepository_makeConstraints NSCoderwiąz


In [83]:
context  = qa_data[7]["passage"]
prompt   = build_prompt(sys_prompt, context, question)
inputs   = tok(prompt, return_tensors="pt").to(lora_model.device)
lora_out = lora_model.generate(**inputs,
                               max_new_tokens=256,
                               do_sample=True,
                               temperature=0.7,
                               top_p=0.9,
                               repetition_penalty=1.1,
                               no_repeat_ngram_size=4,
                               eos_token_id=tok.eos_token_id,
                               pad_token_id = tok.eos_token_id)
print(tok.decode(lora_out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))

Competent decision makers, shared understanding, mutual trust, mission orders, commander's intent, disciplined initiatives, and risk acceptance..**************



In [74]:
final_model_path = MODEL_DIR / "final"
lora_model.save_pretrained(final_model_path.as_posix())