<a href="https://colab.research.google.com/github/gabrielpacheco23/sft-llama3-1-8b-conops-stpa-20examples/blob/main/SFT_Llama3_1_8B_on_conops_stpa_20_examples_(mlabonne_guide).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes

In [None]:
# Import unsloth first so its patches are applied before trl/transformers load
from unsloth import FastLanguageModel, is_bfloat16_supported
from unsloth.chat_templates import get_chat_template

import torch
from trl import SFTTrainer
from datasets import load_dataset
from transformers import TrainingArguments, TextStreamer

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.


In [None]:
max_seq_length = 2048
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
    dtype=None,
)

==((====))==  Unsloth 2024.11.3: Fast Llama patching. Transformers = 4.46.2.
   \\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.5.0+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "up_proj", "down_proj", "o_proj", "gate_proj"],
    use_rslora=True,
    use_gradient_checkpointing="unsloth"
)

Unsloth 2024.11.3 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
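
As a sanity check on this LoRA configuration, the number of trainable parameters can be worked out by hand. Each adapted linear layer of shape in x out gains two low-rank matrices holding r x (in + out) weights in total. The sketch below assumes the publicly documented Llama 3.1 8B dimensions (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers); these values are assumptions and are not stated in the notebook itself.

```python
# Assumed Llama 3.1 8B dimensions (from the public model config, not from this notebook)
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size of the MLP
KV_DIM = 8 * 128       # num_key_value_heads * head_dim
NUM_LAYERS = 32
R = 16                 # LoRA rank used in get_peft_model above

# (in_features, out_features) of each projection listed in target_modules
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, r=R):
    # LoRA adds A (r x in) and B (out x r) next to each frozen weight matrix
    return r * (in_features + out_features)

per_layer = sum(lora_params(i, o) for i, o in shapes.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} in total")
# 1,310,720 per layer, 41,943,040 in total
```

The total agrees with the "Number of trainable parameters" line that the training banner prints further down.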


In [None]:
import pandas as pd
df = pd.read_csv('/content/conops_stpa_dataset_all.csv')

In [None]:
import re

# Define a function to clean markdown text
def clean_markdown(text):
    # Remove headers (e.g., # Header)
    text = re.sub(r"^#+\s+", "", text, flags=re.MULTILINE)

    # Remove images (e.g., ![alt text](url))
    text = re.sub(r"!\[.*?\]\(.*?\)", "", text)

    # Remove links (e.g., [text](url))
    text = re.sub(r"\[([^\]]+)\]\(.*?\)", r"\1", text)

    # Remove list markers (e.g., - Item, * Item) before emphasis, so that a
    # leading "* " is not mistaken for the start of an italic span
    text = re.sub(r"^(\*|\-|\+)\s+", "", text, flags=re.MULTILINE)

    # Remove bold and italic (e.g., **text** or _text_)
    text = re.sub(r"(\*\*|__)(.*?)\1", r"\2", text)
    text = re.sub(r"(\*|_)(.*?)\1", r"\2", text)

    # Remove inline code (e.g., `code`)
    text = re.sub(r"`(.*?)`", r"\1", text)

    # Remove horizontal rules (e.g., ---)
    text = re.sub(r"^-{3,}$", "", text, flags=re.MULTILINE)

    # Remove extra newlines and whitespace
    text = re.sub(r"\n{2,}", "\n", text).strip()

    return text

# Apply the function to each markdown text column
df["conops_text"] = df["conops_text"].apply(clean_markdown)
df["stpa_text"] = df["stpa_text"].apply(clean_markdown)
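
The order of these substitutions matters when a bullet item also contains emphasis, because a leading `* ` can be read as the opening of an italic span. The sketch below isolates only the list-marker and italic patterns from `clean_markdown` and applies them in both orders. `emphasis_first` and `lists_first` are hypothetical helper names used purely for this illustration.

```python
import re

# The two patterns, copied from clean_markdown
ITALIC = r"(\*|_)(.*?)\1"
LIST_MARKER = r"^(\*|\-|\+)\s+"

def emphasis_first(text):
    # Italic pattern runs while the bullet marker is still present
    text = re.sub(ITALIC, r"\2", text)
    text = re.sub(LIST_MARKER, "", text, flags=re.MULTILINE)
    return text.strip()

def lists_first(text):
    # Bullet marker is stripped before the italic pattern runs
    text = re.sub(LIST_MARKER, "", text, flags=re.MULTILINE)
    text = re.sub(ITALIC, r"\2", text)
    return text.strip()

line = "* Use *care* here"
print(repr(emphasis_first(line)))  # 'Use care* here' (stray asterisk left behind)
print(repr(lists_first(line)))     # 'Use care here'
```

Stripping list markers first avoids pairing the bullet with the opening asterisk of the emphasized word.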

In [None]:
df.head()

Unnamed: 0,system_name,conops_source,stpa_source,conops_text,stpa_text
0,ventilator,ventilator_conops_gen.md,ventilator_stpa_gen.md,Concept of Operations (ConOps) for Ventilator ...,Based on the provided Concept of Operations (C...
1,defibrillator,defibrillator_conops_gen.md,defibrillator_stpa_gen.md,Concept of Operations (ConOps) for Defibrillat...,STPA Method: First Step Analysis for Defibrill...
2,elevator,elevator_conops_gen.md,elevator_stpa_gen.md,Concept of Operations (ConOps) for Elevator Sy...,Based on the Concept of Operations (ConOps) fo...
3,atc,atc_conops_gen.md,atc_stpa_gen.md,Concept of Operations (ConOps) for Automatic T...,"Losses (System-level, Undesirable Outcomes):\n..."
4,airliner,airliner_conops_gen.md,airliner_stpa_gen.md,Concept of Operations (ConOps)\nPage One - Sec...,Based on the provided Concept of Operations (C...


In [None]:
import json

def prepare_dataset(df):
    """Convert the DataFrame into ShareGPT-style records.

    Args:
        df: DataFrame with 'conops_text' and 'stpa_text' columns.

    Returns:
        A list of dictionaries, one per row. Each dictionary holds a two-turn
        conversation: the ConOps text as the human turn and the STPA text as
        the assistant ("gpt") turn.
    """
    dataset = []
    for _, row in df.iterrows():
        conversation = [
            {"from": "human", "value": row["conops_text"]},
            {"from": "gpt", "value": row["stpa_text"]},
        ]
        dataset.append({"conversations": conversation})
    return dataset

dataset = prepare_dataset(df)

# Save the records so they can be reloaded with load_dataset("json", ...)
with open('sharegpt_formatted_dataset.json', 'w') as f:
    json.dump(dataset, f, indent=4)
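
Before training on the saved file, it can be worth checking that every record really has the expected two-turn shape. The following is a minimal sketch of such a check; `validate_sharegpt` is a hypothetical helper, not part of any library used in this notebook, and the rule it enforces (exactly one human turn followed by one gpt turn, both non-empty strings) is an assumption based on how the records are built above.

```python
def validate_sharegpt(records):
    """Return a list of (record_index, problem) tuples; an empty list means no issues found."""
    problems = []
    for i, record in enumerate(records):
        turns = record.get("conversations", [])
        roles = [turn.get("from") for turn in turns]
        # Each record is expected to be one human turn followed by one gpt turn
        if roles != ["human", "gpt"]:
            problems.append((i, f"unexpected roles: {roles}"))
        for turn in turns:
            value = turn.get("value")
            # Empty or non-string values would yield degenerate training text
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"empty or non-string value in {turn.get('from')!r} turn"))
    return problems

good = {"conversations": [{"from": "human", "value": "ConOps text"},
                          {"from": "gpt", "value": "STPA text"}]}
bad = {"conversations": [{"from": "human", "value": "ConOps text"},
                         {"from": "gpt", "value": ""}]}
print(validate_sharegpt([good, bad]))
```

An empty result for the real dataset would suggest that all rows survived cleaning with usable text on both sides.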

In [None]:
ds_new = load_dataset("json", data_files="sharegpt_formatted_dataset.json")
# load_dataset returns a DatasetDict; convert the "train" split to a DataFrame
df = ds_new["train"].to_pandas()


In [None]:
tokenizer = get_chat_template(
    tokenizer,
    mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
    chat_template="chatml",
)

def apply_template(examples):
    messages = examples["conversations"]
    text = [tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=False) for message in messages]
    return {"text": text}

dataset = load_dataset("json", data_files={"train": "sharegpt_formatted_dataset.json"})
dataset = dataset.map(apply_template, batched=True)

Unsloth: Will map <|im_end|> to EOS = <|end_of_text|>.
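
For orientation, the ChatML layout wraps each turn in `<|im_start|>` and `<|im_end|>` markers. The sketch below renders a ShareGPT-style conversation by hand using the same key mapping as the cell above. It approximates the standard ChatML layout and is not Unsloth's actual template, which, according to the log line above, maps `<|im_end|>` to the model's EOS token. `render_chatml` and `ROLE_MAP` are hypothetical names introduced for this illustration.

```python
# ShareGPT role names mapped to ChatML role names, mirroring the mapping above
ROLE_MAP = {"human": "user", "gpt": "assistant"}

def render_chatml(conversation, add_generation_prompt=False):
    """Render a list of {"from", "value"} turns in the standard ChatML layout."""
    parts = []
    for turn in conversation:
        role = ROLE_MAP[turn["from"]]
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

conversation = [
    {"from": "human", "value": "ConOps text goes here"},
    {"from": "gpt", "value": "STPA analysis goes here"},
]
print(render_chatml(conversation))
```

Training text is rendered without a generation prompt because the assistant turn is already present, whereas inference passes only the human turn and sets `add_generation_prompt=True`.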



In [None]:
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset['train'],
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    # packing=True,
    args=TrainingArguments(
        learning_rate=3e-4,
        lr_scheduler_type="linear",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        num_train_epochs=10,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        warmup_steps=10,
        output_dir="output",
        seed=0,
    ),
)

trainer.train()


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 21 | Num Epochs = 10
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 2
\        /    Total batch size = 4 | Total steps = 50
 "-____-"     Number of trainable parameters = 41,943,040


Step,Training Loss
1,1.245
2,1.2773
3,1.263
4,1.3779
5,1.2838
6,1.8628
7,1.1836
8,1.1028
9,1.0554
10,1.1366


TrainOutput(global_step=50, training_loss=0.557299862653017, metrics={'train_runtime': 1092.8094, 'train_samples_per_second': 0.192, 'train_steps_per_second': 0.046, 'total_flos': 1.7712558895005696e+16, 'train_loss': 0.557299862653017, 'epoch': 9.090909090909092})
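
The step count and the fractional final epoch in this output follow from the batch settings. The sketch below reproduces them, assuming that optimizer steps per epoch are counted as the number of micro-batches divided by the accumulation steps, rounded down. That assumption is consistent with the numbers logged here, but this is a back-of-the-envelope check rather than the trainer's actual code.

```python
import math

# Values taken from the training configuration and banner above
num_examples = 21
per_device_batch = 2
grad_accum = 2
num_epochs = 10
warmup_steps = 10

# Micro-batches the dataloader yields per epoch (the last one holds a single example)
batches_per_epoch = math.ceil(num_examples / per_device_batch)

# Assumed rule: optimizer steps per epoch use floor division
updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
total_steps = num_epochs * updates_per_epoch

# Epochs actually covered once total_steps optimizer steps have run
final_epoch = total_steps * grad_accum / batches_per_epoch

print(batches_per_epoch, updates_per_epoch, total_steps)  # 11 5 50
print(round(final_epoch, 4))                              # 9.0909
print(warmup_steps / total_steps)                         # 0.2
```

Under this reading, training stops slightly short of ten full passes over the data, and the 10 warmup steps cover the first fifth of the run.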

In [None]:
with open('/content/brake_conops_gen.md', 'r') as f:
    conops_inference = f.read()

conops_inference = clean_markdown(conops_inference)

In [None]:
# Switch Unsloth to its faster inference mode (the model is patched in place)
FastLanguageModel.for_inference(model)

messages = [
    {"from": "human", "value": conops_inference},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(inputs=inputs, streamer=text_streamer, max_new_tokens=128, use_cache=True)

In [None]:
from huggingface_hub import login

login()


In [None]:
model.save_pretrained_merged("model", tokenizer, save_method="merged_16bit")

In [None]:
model.push_to_hub_merged("pachequinho/stf-llama3.1-8b-conops-stpa-20-summfailed", tokenizer, save_method="merged_16bit")

Unsloth: You are pushing to hub, but you passed your HF username = pachequinho.
We shall truncate pachequinho/stf-llama3.1-8b-conops-stpa-20-summfailed to stf-llama3.1-8b-conops-stpa-20-summfailed


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 1.48 out of 12.67 RAM for saving.


100%|██████████| 32/32 [06:49<00:00, 12.80s/it]


Unsloth: Saving tokenizer...

 Done.
Unsloth: Saving model... This might take 5 minutes for Llama-7b...
Unsloth: Saving stf-llama3.1-8b-conops-stpa-20-summfailed/pytorch_model-00001-of-00004.bin...
Unsloth: Saving stf-llama3.1-8b-conops-stpa-20-summfailed/pytorch_model-00002-of-00004.bin...
Unsloth: Saving stf-llama3.1-8b-conops-stpa-20-summfailed/pytorch_model-00003-of-00004.bin...
Unsloth: Saving stf-llama3.1-8b-conops-stpa-20-summfailed/pytorch_model-00004-of-00004.bin...


Done.
Saved merged model to https://huggingface.co/pachequinho/stf-llama3.1-8b-conops-stpa-20-summfailed
