# 💧 LFM2 - SFT with TRL

This tutorial demonstrates how to fine-tune our LFM2 models, e.g. [`LiquidAI/LFM2-1.2B`](https://huggingface.co/LiquidAI/LFM2-1.2B), using the TRL library.

Follow along if it's your first time using TRL, or copy individual code snippets into your own workflow.

## 🎯 What you'll find:
- **SFT** (Supervised Fine-Tuning) - Basic instruction following
- **LoRA + SFT** - Using LoRA (from PEFT) to run SFT on constrained hardware

## 📋 Prerequisites:
- **GPU Runtime**: Select GPU in `Runtime` → `Change runtime type`
- **Hugging Face Account**: For accessing models and datasets



# 📦 Installation & Setup

First, let's install all the required packages:


In [None]:
# Quote each requirement so the shell does not treat ">=" as an output redirect
!pip install "transformers==4.54.1" "trl>=0.18.2" "peft>=0.15.2"

In [None]:
!pip install sentencepiece --upgrade

In [None]:
#!pip install patchelf

In [None]:
#!patchelf --add-rpath '$ORIGIN/../../nvidia/cusparse/lib' /usr/local/lib/python3.11/dist-packages/torch/lib/libtorch_cuda.so

Let's now verify that the packages are installed correctly:

In [None]:
import torch
import transformers
import trl
import os
os.environ["WANDB_DISABLED"] = "true"

print(f"📦 PyTorch version: {torch.__version__}")
print(f"🤗 Transformers version: {transformers.__version__}")
print(f"📊 TRL version: {trl.__version__}")
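Printing the versions shows what is installed, but not whether it satisfies the version constraints from the install cell. The following is a minimal sketch of such a check using only the standard library. The helper names `release_tuple` and `meets_minimum` are hypothetical, and the comparison is deliberately simplified: it looks only at the leading numeric part of a version string and ignores pre-release or local tags such as `.dev0` or `+cu121`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the install cell (transformers is pinned exactly there)
MINIMUMS = {"transformers": "4.54.1", "trl": "0.18.2", "peft": "0.15.2"}


def release_tuple(version_string):
    """Turn '0.18.2' or '0.20.0.dev0' into a comparable tuple such as (0, 18, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"Cannot parse version: {version_string!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad to three components so that '4.54' compares equal to '4.54.0'
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed, minimum):
    """Compare numerically, so that '0.9.0' is correctly older than '0.18.2'."""
    return release_tuple(installed) >= release_tuple(minimum)


for name, minimum in MINIMUMS.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed (need >= {minimum})")
        continue
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    print(f"{name}: {installed} (need >= {minimum}) -> {status}")
```

Comparing the raw strings would rank `0.9.0` above `0.18.2`, which is why the versions are converted to integer tuples first.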

# Loading the model from Transformers 🤗



In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM
from IPython.display import display, HTML, Markdown
import torch

model_id = "LiquidAI/LFM2-350M" # <- or LFM2-700M or LFM2-1.2B

print("📚 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_id)

print("🧠 Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="bfloat16",
#   attn_implementation="flash_attention_2" <- uncomment on compatible GPU
)

print("✅ Local model loaded successfully!")
print(f"🔢 Parameters: {model.num_parameters():,}")
print(f"📖 Vocab size: {len(tokenizer)}")
print(f"💾 Model size: ~{model.num_parameters() * 2 / 1e9:.1f} GB (bfloat16)")

📚 Loading tokenizer...
🧠 Loading model...



✅ Local model loaded successfully!
🔢 Parameters: 354,483,968
📖 Vocab size: 64400
💾 Model size: ~0.7 GB (bfloat16)
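The model size printed above is simply the parameter count multiplied by the bytes per parameter. A small sketch of that arithmetic for a few common dtypes follows; the helper name `weights_size_gb` and the dtype table are illustrative and not part of Transformers. The figure covers the weights only: training also needs memory for gradients, optimizer states, and activations.

```python
# Bytes used to store one parameter in each dtype
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weights_size_gb(num_parameters, dtype="bfloat16"):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_parameters * BYTES_PER_PARAM[dtype] / 1e9


num_parameters = 354_483_968  # parameter count reported in the output above
for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weights_size_gb(num_parameters, dtype):.1f} GB")
```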


# 🎯 Part 1: Supervised Fine-Tuning (SFT)

SFT teaches the model to follow instructions by training on input-output pairs (an instruction and its response). This is the foundation for creating instruction-following models.

## Load an SFT Dataset

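We will use [oddadmix/arabic-news-summarization](https://huggingface.co/datasets/oddadmix/arabic-news-summarization), taking the article and summary from its `origin_text_translated` and `summary_text_translated` columns. The code below loads the full `train` and `test` splits, drops rows with missing text, and wraps each pair as a user/assistant conversation. To train on a subset, add a slicing index to the `split` parameter, for example `split="train[:5000]"`.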

In [2]:
from datasets import load_dataset

print("📥 Loading SFT dataset...")
train_dataset_sft = load_dataset("oddadmix/arabic-news-summarization", split="train")
eval_dataset_sft = load_dataset("oddadmix/arabic-news-summarization", split="test")


def filter_empty(example):
    # Keep only rows where both the article and its summary are present
    return (
        example["summary_text_translated"] is not None
        and example["origin_text_translated"] is not None
    )

train_dataset_sft = train_dataset_sft.filter(filter_empty)
eval_dataset_sft = eval_dataset_sft.filter(filter_empty)


def convert_to_conversation(example):
    example["messages"] = [
        {
        "content": "Summarize the following text: \n\n " + example["origin_text_translated"] ,
        "role": "user"
        },
        {
        "content": example["summary_text_translated"],
        "role": "assistant"
        }
        ]
    return example

train_dataset_sft = train_dataset_sft.map(convert_to_conversation, remove_columns=["origin_text_translated", "origin_text", "summary_text_translated","summary_text"])
eval_dataset_sft = eval_dataset_sft.map(convert_to_conversation, remove_columns=["origin_text_translated", "origin_text", "summary_text_translated","summary_text"])

print("✅ SFT Dataset loaded:")
print(f"   📚 Train samples: {len(train_dataset_sft)}")
print(f"   🧪 Eval samples: {len(eval_dataset_sft)}")
print(f"\n📝 Single Sample: {train_dataset_sft[0]['messages']}")

📥 Loading SFT dataset...



✅ SFT Dataset loaded:
   📚 Train samples: 17862
   🧪 Eval samples: 181

📝 Single Sample: [{'content': 'Summarize the following text: \n\n بقلم: وليام تورفيل. نشر في: 08:04 صباحًا بتوقيت شرق الولايات المتحدة، 21 ديسمبر 2013 | تم تحديثه: 09:07 صباحًا بتوقيت شرق الولايات المتحدة، 21 ديسمبر 2013. إذا كنت تدخل في الروح المناسبة لعيد الميلاد بمجرد سماع أغنية عيد الميلاد أو رؤية صفوف من الزخارف للبيع، فمن الأفضل أن تبتعد عن هذه الشوارع. قد تبدو شوارع مثل تينسيل لين في نونيتون، وارويكشاير، غريبة في الصيف - لكنها تأخذ مكانها في ديسمبر. وجد بحث موظفي رويال ميل أن المملكة المتحدة لديها 3369 اسمًا لشوارع مرتبطة بعيد الميلاد، بناءً على تحليل كل اسم شارع في البلاد. تينسيل لين في نونيتون، وارويكشاير، مزينة هذا العام من قبل السكان ميشيل أورتون وابنها ذي الثلاث سنوات، ماثيو. تعرض الأخوات نيام ديفي (على اليسار)، البالغة من العمر ثماني سنوات، وسيارا ديفي، البالغة من العمر خمس سنوات، أمام اسم شارعهم - الذي قد يبدو غريبًا بعض الشيء في الصيف - في نورثامبتون. هولي ستريت هي أكثر الأسماء شيوعًا لشوارع عيد المي
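To make the target format easier to see, here is a minimal sketch of the same filter-and-convert logic applied to plain Python dictionaries instead of a `datasets.Dataset`. The two toy records are invented for illustration and are not taken from the dataset; the helper names `has_both_fields` and `to_messages` are likewise hypothetical.

```python
PROMPT_PREFIX = "Summarize the following text: \n\n "


def has_both_fields(example):
    """Keep only rows where both the article and its summary are present."""
    return (
        example["origin_text_translated"] is not None
        and example["summary_text_translated"] is not None
    )


def to_messages(example):
    """Build the two-turn conversation stored in the `messages` column."""
    return {
        "messages": [
            {"role": "user", "content": PROMPT_PREFIX + example["origin_text_translated"]},
            {"role": "assistant", "content": example["summary_text_translated"]},
        ]
    }


# Hypothetical toy records (not real dataset rows)
toy_records = [
    {"origin_text_translated": "A long news article ...", "summary_text_translated": "A short summary."},
    {"origin_text_translated": "An article without a summary.", "summary_text_translated": None},
]

conversations = [to_messages(r) for r in toy_records if has_both_fields(r)]
print(f"Kept {len(conversations)} of {len(toy_records)} records")
for message in conversations[0]["messages"]:
    print(f"{message['role']}: {message['content']!r}")
```

Each surviving record becomes a list with one `user` turn and one `assistant` turn, which is the shape shown in the single sample above.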

## Launch Training

We are now ready to launch an SFT run with `SFTTrainer`. Feel free to modify `SFTConfig` to experiment with different configurations.
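Before launching, it can help to estimate how many optimizer steps the run will take. The sketch below does that arithmetic for the values used in the next cell (17,862 training samples, batch size 16, 10 epochs). It assumes a single GPU, no gradient accumulation, and no sequence packing; the function name `schedule_summary` is hypothetical. It also assumes that, in the Transformers version pinned in this tutorial, a positive `warmup_steps` takes precedence over `warmup_ratio` when both are set.

```python
import math


def schedule_summary(num_samples, batch_size, num_epochs, warmup_steps, warmup_ratio):
    """Estimate optimizer steps for one device without gradient accumulation."""
    # The last, partially filled batch of each epoch still counts as a step
    steps_per_epoch = math.ceil(num_samples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    ratio_based_warmup = math.ceil(total_steps * warmup_ratio)
    # Assumption: a positive warmup_steps overrides warmup_ratio
    effective_warmup = warmup_steps if warmup_steps > 0 else ratio_based_warmup
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "ratio_based_warmup": ratio_based_warmup,
        "effective_warmup": effective_warmup,
    }


summary = schedule_summary(
    num_samples=17862, batch_size=16, num_epochs=10, warmup_steps=100, warmup_ratio=0.2
)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the run has 1,117 steps per epoch and 11,170 steps in total, and the learning rate warms up over 100 steps rather than over 20% of training.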



In [None]:
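import os

from transformers.trainer_utils import get_last_checkpoint
from trl import SFTConfig, SFTTrainer

sft_config = SFTConfig(
    output_dir="./lfm2-sft-summary",
    hub_model_id="oddadmix/arabic-summarization",  # target repo for push_to_hub()
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    lr_scheduler_type="linear",
    warmup_steps=100,
    warmup_ratio=0.2,  # ignored here: a positive warmup_steps takes precedence
    logging_steps=10,
    save_strategy="epoch",
    eval_strategy="epoch",
    load_best_model_at_end=True,
    report_to="none",  # the string "none" disables logging integrations; None does not
    bf16=False # <- not all colab GPUs support bf16
)

print("🏗️  Creating SFT trainer...")
sft_trainer = SFTTrainer(
    model=model,
    args=sft_config,
    train_dataset=train_dataset_sft,
    eval_dataset=eval_dataset_sft,
    processing_class=tokenizer,
)

print("\n🚀 Starting SFT training...")
# Resume only if output_dir already holds a checkpoint; otherwise start from scratch
last_checkpoint = (
    get_last_checkpoint(sft_config.output_dir)
    if os.path.isdir(sft_config.output_dir)
    else None
)
sft_trainer.train(resume_from_checkpoint=last_checkpoint)

print("🎉 SFT training completed!")

# The first argument of push_to_hub() is a commit message; the repo comes from hub_model_id
sft_trainer.push_to_hub()
#sft_trainer.save_model()
print(f"💾 SFT model saved to: {sft_config.output_dir}")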

🏗️  Creating SFT trainer...



[2025-08-19 00:27:18,883] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)

🚀 Starting SFT training...


There were missing keys in the checkpoint model loaded: ['lm_head.weight'].
[34m[1mwandb[0m: Currently logged in as: [33mahmed-m-wasfy[0m. Use [1m`wandb login --relogin`[0m to force relogin


Epoch,Training Loss,Validation Loss
