# Fine-tuning Llama 2 on a new programming language

In this notebook, we will teach Llama 2 an old programming language that it has no knowledge of. OPL is a programming language from the 1980s, created by Psion in the UK. It powered their Psion Organisers.

This notebook is designed to run on a T4 GPU.

## Set up the Model

The following section performs all the setup of the model. This includes:

- Installing any dependencies

- Setting any configuration

- Downloading the Base Model

### Install dependencies

To get started, we need to install the appropriate dependencies.

In [1]:
# install trl for the SFT library
!pip install -q trl

In [2]:
# install bitsandbytes for quantization
!pip install -q bitsandbytes

In [3]:
# we need sentencepiece for the llama2 slow tokenizer
!pip install sentencepiece



In [4]:
# we need einops, used by falcon-7b, llama-2 etc
# einops (einsteinops) is used to simplify tensorops by making them readable
!pip install -q -U einops

In [6]:
# we need to install datasets for our training dataset
!pip install -q datasets

### Settings

The following configures the settings for fine-tuning our model.

In [7]:
# The model that you want to train from the Hugging Face hub
model_name = "NousResearch/Llama-2-7b-chat-hf"

In [8]:
# The instruction dataset to use
dataset_name = "chrishayuk/test"

In [9]:
# Fine-tuned model name
new_model = "llama-2-7b-chuk-test"

In [10]:
# Output directory where the model predictions and checkpoints will be stored
output_dir = "./results"

In [11]:
# Number of training epochs
num_train_epochs = 20

### Download the base model

The following will download the base model, in this case the llama-2-7b-chat-hf model.

In [12]:
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    pipeline,
    logging,
)


In [13]:
# load the quantized settings, we're doing 4 bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)
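
As a rough check on why 4-bit loading matters on a T4, the sketch below estimates the memory taken by the weights alone. The 7-billion parameter count is a nominal figure taken from the model name and the 16 GB figure is the T4's published memory size; both are assumptions here, and the estimate ignores quantization constants, activations, and optimizer state.

```python
# Back-of-the-envelope weight memory for a ~7B-parameter model.
# PARAMS and T4_MEMORY_GB are nominal figures, not measured values.
PARAMS = 7e9
T4_MEMORY_GB = 16


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("fp32", 32), ("fp16", 16), ("nf4 (4-bit)", 4)]:
    gb = weight_memory_gb(PARAMS, bits)
    share = gb / T4_MEMORY_GB
    print(f"{label:12s} {gb:5.1f} GB  ({share:.0%} of a {T4_MEMORY_GB} GB card)")
```

At 16 bits the weights alone would take most of the card, leaving little room for training; at 4 bits they come to roughly 3.5 GB.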

In [14]:
# Load base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    # use the gpu
    device_map={"": 0}
)


Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00,  1.28s/it]


In [15]:
model.config.use_cache = False

In [16]:
# Load the tokenizer from the model (llama2)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, use_fast=False)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

### Run the Model

The following tests the capabilities of the language model before fine-tuning.

In [17]:
# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

In [18]:
# Run text generation pipeline with our base model
#prompt = "What does the REM keyword stand for in OPL?"
#prompt = "What was the first implementation of the OPL programming language?"
#prompt = "Who were involved in the creation of OPL for Psion?"
#prompt = "Did Colly Myers create OPL?"
prompt = "Write a hello world program in the OPL programming language. "
#prompt = "Write a limerick as comments in the OPL programming language. "
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"[INST] {prompt} [/INST]")
print(result[0]['generated_text'])



[INST] Write a hello world program in the OPL programming language.  [/INST]  The OPL (Optimizing Programming Language) is a programming language that was developed in the 1970s and 1980s for use in high-performance computing applications. Unfortunately, the language is no longer widely used or supported, and there are few resources available for learning or programming in OPL.

As such, I must inform you that it is not possible to write a "hello world" program in OPL as there are no existing resources or tools available to support the language. The language is considered obsolete and is no longer supported by any major software vendors or communities.

I apologize for any disappointment this may cause, but I hope you understand that it is not possible to learn or use an obsolete programming language without access to the necessary resources and support. There are many other programming languages that are


### Fine-tune the Model

#### Load the dataset

In [19]:
from datasets import load_dataset

In [20]:
dataset = load_dataset(dataset_name, split="train")
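
`SFTTrainer` is later pointed at a `text` column, so each row of the dataset needs to hold one complete training example as a single string. The layout of `chrishayuk/test` is not shown in this notebook; the sketch below assumes the common Llama 2 chat layout (`<s>[INST] question [/INST] answer </s>`). The helper name and the sample question and answer are made up for illustration.

```python
def to_training_text(instruction: str, response: str) -> str:
    """Wrap one instruction/response pair in Llama 2 chat markers (assumed layout)."""
    return f"<s>[INST] {instruction.strip()} [/INST] {response.strip()} </s>"


# Illustrative pair only; not taken from the actual dataset.
row = {
    "text": to_training_text(
        "What does the REM keyword do in OPL?",
        "REM marks the rest of the line as a comment.",
    )
}
print(row["text"])
```

A list of such rows could then be turned into a dataset, for example with `datasets.Dataset.from_list`, before being handed to the trainer.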

#### Fine-tune the Model

In [21]:
# torch, the Auto* classes, BitsAndBytesConfig, pipeline, and logging were
# imported earlier; only TrainingArguments is new here
from transformers import TrainingArguments

In [22]:
from peft import LoraConfig, PeftModel

In [23]:
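# pin the library versions used for training; ideally run this before the imports
# above and restart the runtime so the pinned versions are the ones loaded
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7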

In [24]:
from trl import SFTTrainer

In [25]:
# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
)
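
To get a feel for what `r=64` and `lora_alpha=16` mean, the sketch below counts the parameters LoRA would add and computes the scaling factor applied to the low-rank update. It assumes Llama 2 7B's hidden size of 4096 and 32 decoder layers, and that adapters are attached only to the `q_proj` and `v_proj` projections, which is the usual default for Llama models when `target_modules` is not set. Check the installed `peft` version if the exact count matters.

```python
HIDDEN_SIZE = 4096      # assumed: Llama 2 7B hidden dimension
NUM_LAYERS = 32         # assumed: Llama 2 7B decoder layers
TARGETS_PER_LAYER = 2   # assumed: q_proj and v_proj only


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A rank-r adapter adds two matrices: (d_in x r) and (r x d_out)."""
    return rank * (d_in + d_out)


r, lora_alpha = 64, 16
per_module = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, r)
total_trainable = per_module * TARGETS_PER_LAYER * NUM_LAYERS
scaling = lora_alpha / r  # factor applied to the low-rank update

print(f"parameters per adapted module: {per_module:,}")
print(f"total trainable parameters:    {total_trainable:,}")
print(f"share of a nominal 7B model:   {total_trainable / 7e9:.2%}")
print(f"LoRA scaling (alpha / r):      {scaling}")
```

Under these assumptions only about half a percent of the model's parameters are trained, and the low-rank update is scaled by 0.25.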

In [26]:
# Set training parameters
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,      # uses the number of epochs earlier
    per_device_train_batch_size=4,          # 4 seems reasonable
    gradient_accumulation_steps=2,          # 2 is fine, as we're a small batch
    optim="paged_adamw_32bit",              # paged AdamW (via bitsandbytes), not the Trainer default
    save_steps=0,                           # we're not gonna save
    logging_steps=10,                       # same value as used by Meta
    learning_rate=2e-4,                     # standard learning rate
    weight_decay=0.001,                     # standard weight decay 0.001
    fp16=False,                             # fp16 mixed precision, left off here
    bf16=False,                             # can be set to True on an A100 (a T4 has no bf16 support)
    max_grad_norm=0.3,                      # standard setting
    max_steps=-1,                           # needs to be -1, otherwise overrides epochs
    warmup_ratio=0.03,                      # standard warmup ratio
    group_by_length=True,                   # speeds up the training
    lr_scheduler_type="cosine",             # cosine decay; the logged learning rate falls to 0 by the last step
    report_to="tensorboard"
)
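
The warmup and cosine settings can be checked by hand. The sketch below reimplements a linear-warmup, half-cosine decay schedule in plain Python. The function name is hypothetical, and the total of 100 optimizer steps is an assumption read off the training log in this notebook (10 log lines at `logging_steps=10`). Under those assumptions it reproduces the logged learning rates.

```python
import math


def lr_at(step: int, base_lr: float = 2e-4, total_steps: int = 100,
          warmup_ratio: float = 0.03) -> float:
    """Learning rate after `step` optimizer steps (hypothetical helper)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # linear ramp from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # half-cosine decay from base_lr down to 0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (10, 50, 100):
    print(step, lr_at(step))
```

Steps 10, 50, and 100 come out at about 1.974e-4, 1.049e-4, and 0, in line with the `learning_rate` values in the log.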

In [28]:
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,                
    dataset_text_field="text",
    max_seq_length=None,                    
    tokenizer=tokenizer,                    
    args=training_arguments,                
    packing=False,
)

In [29]:
# Train model
trainer.train()

# Save trained model
trainer.model.save_pretrained(new_model)

{'loss': 3.2457, 'learning_rate': 0.00019744105246469263, 'epoch': 1.82}
{'loss': 2.3364, 'learning_rate': 0.00018522168236559695, 'epoch': 3.64}
{'loss': 1.6345, 'learning_rate': 0.000164140821963114, 'epoch': 5.45}
{'loss': 1.2569, 'learning_rate': 0.00013639049369634876, 'epoch': 7.27}
{'loss': 0.9618, 'learning_rate': 0.00010485622221144484, 'epoch': 9.09}
{'loss': 0.7234, 'learning_rate': 7.281699277636572e-05, 'epoch': 10.91}
{'loss': 0.5881, 'learning_rate': 4.360429701490934e-05, 'epoch': 12.73}
{'loss': 0.4927, 'learning_rate': 2.025571894372794e-05, 'epoch': 14.55}
{'loss': 0.4684, 'learning_rate': 5.199082004372957e-06, 'epoch': 16.36}
{'loss': 0.4636, 'learning_rate': 0.0, 'epoch': 18.18}
{'train_runtime': 147.9738, 'train_samples_per_second': 5.812, 'train_steps_per_second': 0.676, 'train_loss': 1.2171390438079834, 'epoch': 18.18}
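
The log stops at epoch 18.18 even though `num_train_epochs` is 20. One reading fits every number above. It assumes the dataset holds 43 examples, since `train_samples_per_second` × `train_runtime` / 20 ≈ 43; this is an inference, not something the notebook states. It also assumes the Trainer plans its total step count by rounding the optimizer steps per epoch down. The sketch below walks through that arithmetic; treat it as an explanation that fits the log rather than a description of the Trainer's internals.

```python
import math

num_examples = 43        # inferred from the log, not stated in the notebook
per_device_batch = 4     # per_device_train_batch_size
grad_accum = 2           # gradient_accumulation_steps
num_epochs = 20          # num_train_epochs

# Micro-batches needed to see every example once.
batches_per_epoch = math.ceil(num_examples / per_device_batch)

# Optimizer steps per epoch, rounded down (assumed planning rule).
updates_per_epoch = batches_per_epoch // grad_accum

# Total optimizer steps planned for the whole run.
total_updates = num_epochs * updates_per_epoch

# Each optimizer step consumes grad_accum micro-batches, so the run
# covers this many passes over the data before it stops.
epochs_covered = total_updates * grad_accum / batches_per_epoch

print(f"micro-batches per epoch: {batches_per_epoch}")
print(f"optimizer steps planned: {total_updates}")
print(f"epochs actually covered: {epochs_covered:.2f}")
```

With an odd number of micro-batches per epoch, one micro-batch per epoch is left out of the plan. Settings where the batches per epoch divide evenly by `gradient_accumulation_steps` would avoid the shortfall.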


#### Run the Model

The following runs the model after fine-tuning.

In [30]:
prompt = "Write a Hello Chris program in psion opl"

In [32]:
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)

In [33]:
result = pipe(f"[INST] {prompt} [/INST]")



In [34]:
print(result[0]['generated_text'])

[INST] Write a Hello Chris program in psion opl [/INST] PROC main:
 everybody:LOADELL 
	PRINT "Hello World"
	GET
ENDP 

PROC hello(a):LOADELL
	PRINT "Hello ", a
	GET
ENDP 

hello Chris:INT

main:
	hello Chris
	GET
ENDP 

ENDP 

