<br>

# 6) Instruction finetuning 

<br>

# 6.1 Preparing a dataset for supervised instruction finetuning

In [None]:
import json


file_path = "LLM-workshop-2024/06_finetuning/instruction-data.json"

with open(file_path, "r") as file:
    data = json.load(file)
print("Number of entries:", len(data))

In [None]:
print("Example entry:\n", data[50])

In [None]:
print("Another example entry:\n", data[999])

- We format each entry with the Alpaca-style prompt template, one of the first widely adopted templates for instruction finetuning


In [None]:
def format_input(entry):
    instruction_text = (
        f"Below is an instruction that describes a task. "
        f"Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )

    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""

    return instruction_text + input_text

- A formatted example for an entry with an input field:

In [None]:
model_input = format_input(data[50])
desired_response = f"\n\n### Response:\n{data[50]['output']}"

print(model_input + desired_response)

- A formatted example for an entry without an input field:

In [None]:
model_input = format_input(data[999])
desired_response = f"\n\n### Response:\n{data[999]['output']}"

print(model_input + desired_response)
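
- To see the template independently of the dataset, the following sketch applies the same `format_input` logic to two hypothetical entries (they are made up for illustration and are not taken from `instruction-data.json`):

```python
def format_input(entry):
    # Same Alpaca-style template as above
    instruction_text = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    # The "### Input:" block is only added when the entry has a non-empty input
    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
    return instruction_text + input_text


# Hypothetical entries, made up for illustration
with_input = {"instruction": "Convert the word to uppercase.", "input": "hello", "output": "HELLO"}
without_input = {"instruction": "Name a primary color.", "input": "", "output": "Red."}

prompt_with = format_input(with_input)
prompt_without = format_input(without_input)

print(prompt_with + f"\n\n### Response:\n{with_input['output']}")
print("-" * 40)
print(prompt_without + f"\n\n### Response:\n{without_input['output']}")
```

- The second prompt contains no `### Input:` block, because an empty input string is skipped rather than rendered as an empty section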

<br>

# 6.2 Creating training and test sets

In [None]:
import json


file_path = "LLM-workshop-2024/06_finetuning/instruction-data.json"

with open(file_path, "r") as file:
    data = json.load(file)
print("Number of entries:", len(data))

In [None]:
train_portion = int(len(data) * 0.85)  # First 85% of the entries for training

train_data = data[:train_portion]
test_data = data[train_portion:]       # Remaining ~15% for testing

In [None]:
print("Training set length:", len(train_data))
print("Test set length:", len(test_data))
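
- The split slices the list at a single index, so the training and test sets are disjoint and no entry is lost to rounding. A minimal sketch of the same logic as a reusable function (`split_dataset` is a hypothetical helper name, and the toy list only stands in for the real dataset):

```python
def split_dataset(data, train_fraction=0.85):
    # Index of the first test entry; int() truncates toward zero
    train_portion = int(len(data) * train_fraction)
    # Slicing at a single index guarantees the two parts are disjoint
    # and together cover every entry
    return data[:train_portion], data[train_portion:]


# Toy stand-in for the instruction dataset
toy_data = [{"instruction": f"Task {i}", "input": "", "output": ""} for i in range(20)]
toy_train, toy_test = split_dataset(toy_data)

print("Training set length:", len(toy_train))
print("Test set length:", len(toy_test))
```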

In [None]:
with open("train.json", "w") as json_file:
    json.dump(train_data, json_file, indent=4)
    
with open("test.json", "w") as json_file:
    json.dump(test_data, json_file, indent=4)

<br>

# 6.3 Instruction finetuning

- We finetune with LitGPT using LoRA (`litgpt finetune_lora model_dir`), which is faster and less resource-intensive than updating all model weights

In [None]:
!litgpt finetune_lora microsoft/phi-2 \
--data JSON \
--data.val_split_fraction 0.1 \
--data.json_path train.json \
--train.epochs 3 \
--train.log_interval 100

<br>

# 6.4 Evaluating instruction responses locally using a Llama 3 model

- This notebook uses an 8-billion-parameter Llama 3 model through LitGPT to evaluate the responses of instruction-finetuned LLMs. It expects a dataset in JSON format that includes the generated model responses, for example:



```python
{
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "response_before": "\nThe atomic number of helium is 3.0", # <-- Response by an LLM
    "response_after": "\nThe atomic number of helium is 2."    # <-- Response by a 2nd LLM
},
```

- The code doesn't require a GPU and runs on a laptop (it was tested on an M3 MacBook Air)
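
- Because the scoring code later in this section indexes these keys directly, it can help to confirm up front that every entry has them. A minimal sketch (`find_incomplete_entries` and the second sample entry are hypothetical and only illustrate the check):

```python
REQUIRED_KEYS = {"instruction", "input", "output", "response_before", "response_after"}


def find_incomplete_entries(entries, required_keys=REQUIRED_KEYS):
    # Return (index, missing_keys) pairs for entries lacking any required key
    problems = []
    for idx, entry in enumerate(entries):
        missing = required_keys - set(entry)
        if missing:
            problems.append((idx, sorted(missing)))
    return problems


sample_entries = [
    {
        "instruction": "What is the atomic number of helium?",
        "input": "",
        "output": "The atomic number of helium is 2.",
        "response_before": "\nThe atomic number of helium is 3.0",
        "response_after": "\nThe atomic number of helium is 2.",
    },
    # Hypothetical entry that lacks both model responses
    {"instruction": "Incomplete entry", "input": "", "output": "..."},
]

print(find_incomplete_entries(sample_entries))
```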

In [None]:
from importlib.metadata import version

pkgs = [
    "tqdm",  # Progress bar
]

for p in pkgs:
    print(f"{p} version: {version(p)}")

<br>

# 6.5 Loading JSON entries

In [None]:
import json

json_file = "test_response_before_after.json"

with open(json_file, "r") as file:
    json_data = json.load(file)

print("Number of entries:", len(json_data))

In [None]:
json_data[0]

In [None]:
def format_input(entry):
    instruction_text = (
        f"Below is an instruction that describes a task. Write a response that "
        f"appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )

    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""

    return instruction_text + input_text

print(format_input(json_data[0])) # input

In [None]:
json_data[0]["output"]

In [None]:
json_data[0]["response_before"]

- We use LitGPT to load the Llama 3 model that scores and compares the model responses:

In [None]:
from litgpt import LLM

llm = LLM.load("meta-llama/Meta-Llama-3-8B-Instruct")

In [None]:
from tqdm import tqdm


def generate_model_scores(json_data, json_key):
    scores = []
    for entry in tqdm(json_data, desc="Scoring entries"):
        prompt = (
            f"Given the input `{format_input(entry)}` "
            f"and correct output `{entry['output']}`, "
            f"score the model response `{entry[json_key]}`"
            f" on a scale from 0 to 100, where 100 is the best score. "
            f"Respond with the integer number only."
        )
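        score = llm.generate(prompt, max_new_tokens=50)
        try:
            scores.append(int(score))
        except ValueError:
            # Skip replies that are not a bare integer, but report them
            # so that dropped entries do not go unnoticed
            print(f"Could not convert score: {score}")
            continue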

    return scores
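
- `int(score)` only succeeds when the reply is a bare integer, and LLMs sometimes wrap the number in extra text. One possible way to make the parsing more tolerant and to summarize the results is sketched below. `extract_score` and `average_score` are hypothetical helpers, not part of LitGPT, and the sample replies are made up:

```python
import re


def extract_score(reply):
    # Take the first run of digits in the reply, or None if there is none
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    score = int(match.group())
    # Discard values outside the 0-100 scale requested in the prompt
    return score if 0 <= score <= 100 else None


def average_score(replies):
    # Average over the replies that yielded a valid score
    scores = [s for s in map(extract_score, replies) if s is not None]
    return sum(scores) / len(scores) if scores else None


# Made-up replies standing in for llm.generate(...) outputs
sample_replies = ["85", " 90\n", "Score: 70/100", "I cannot score this."]
print("Parsed scores:", [extract_score(r) for r in sample_replies])
print("Average score:", average_score(sample_replies))
```

- Taking the first integer is a heuristic: a reply such as "2 of 3 facts are correct, so 60" would be parsed as 2, so it is worth inspecting a few raw replies before trusting the averages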