# Car Maintenance LLM Assistant — LoRA Fine-Tuning (Colab)

This notebook fine-tunes **TinyLlama-1.1B-Chat** (or Gemma-2B) on car maintenance Q&A using **LoRA** adapters on a 4-bit quantized base model, trained with TRL's **SFTTrainer**.

**Steps:**
1. Enable GPU: Runtime → Change runtime type → T4 GPU
2. Upload your `data/training/` folder (train.json) or clone the repo
3. Run all cells
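
Before running the remaining cells, it can help to confirm that the runtime from step 1 actually exposes a GPU. The following is a minimal sketch using only the standard library. It assumes an NVIDIA runtime where the `nvidia-smi` tool is on the PATH (typical for Colab GPU runtimes), and the helper name `gpu_visible` is hypothetical.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` lists at least one GPU."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return False
    try:
        result = subprocess.run(
            [smi, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible" if gpu_visible() else "No GPU found: check the runtime type")
```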

## 1. Install dependencies

In [None]:
!pip install -q transformers datasets peft accelerate bitsandbytes trl
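
Because the install above is unpinned and later cells depend on version-specific `trl` arguments, it may be worth recording which versions were actually installed. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["transformers", "datasets", "peft", "accelerate", "bitsandbytes", "trl"]


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```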

## 2. Upload training data

Upload `train.json` (and optionally `val.json`) from `data/training/`, or mount Google Drive and copy the files from your project folder.

In [11]:
# Option A: Upload files manually (run this cell, then use Colab file browser to upload data/training/train.json)
import os
os.makedirs("data/training", exist_ok=True)

# Option B: If the project is in Google Drive, mount it and copy the file:
# from google.colab import drive
# drive.mount("/content/drive")
# !cp /content/drive/MyDrive/auto-maintenance-assistant/data/training/train.json data/training/

if os.path.exists("data/training/train.json"):
    print("train.json found.")
else:
    print("Upload train.json to data/training/ (e.g. drag-and-drop in file panel).")

Upload train.json to data/training/ (e.g. drag-and-drop in file panel).
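
Once the file is uploaded, a quick structural check can catch malformed records before training. The sketch below assumes the schema implied by `format_prompt` in section 4: a JSON list of objects with `instruction` and `output` strings and an optional `input`. The helper name `validate_records` is hypothetical.

```python
import json
from pathlib import Path


def validate_records(records):
    """Return a list of problems found; an empty list means the data looks usable."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of records"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: not an object")
            continue
        # `input` is optional, so only the two required keys are checked.
        for key in ("instruction", "output"):
            value = rec.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{key}'")
    return problems


path = Path("data/training/train.json")
if path.exists():
    issues = validate_records(json.loads(path.read_text(encoding="utf-8")))
    print("OK" if not issues else "\n".join(issues))
```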


## 3. Config and imports

In [12]:
import json
import torch
from pathlib import Path
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, TaskType
from trl import SFTConfig, SFTTrainer

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # or "google/gemma-2b-it" with appropriate access
DATA_DIR = "data/training"
OUTPUT_DIR = "car-maintenance-llm"
MAX_SEQ_LENGTH = 512
LORA_R, LORA_ALPHA, LORA_DROPOUT = 16, 32, 0.05
BATCH_SIZE = 2
GRADIENT_ACCUMULATION = 4
LEARNING_RATE = 2e-4
NUM_EPOCHS = 2
WARMUP_RATIO = 0.03
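
To see what these settings mean in practice, the arithmetic below estimates the effective batch size and the number of optimizer steps. It is a sketch for a single device and assumes that a trailing partial accumulation window still counts as one step; that assumption is consistent with the `checkpoint-8` directory in the section 7 listing. The example count of 28 is the one reported by the dataset cell in section 4, and the helper name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(num_examples, batch_size, grad_accum, epochs):
    """Return (effective_batch, steps_per_epoch, total_steps) on a single device."""
    effective_batch = batch_size * grad_accum
    batches_per_epoch = math.ceil(num_examples / batch_size)
    # Assumption: a trailing partial accumulation window still counts as one step.
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs


# BATCH_SIZE=2, GRADIENT_ACCUMULATION=4, NUM_EPOCHS=2 from the config above.
print(optimizer_steps(num_examples=28, batch_size=2, grad_accum=4, epochs=2))
# (8, 4, 8)
```

With only 8 optimizer steps in total and `logging_steps=5`, a single loss value is logged during the run.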

## 4. Load dataset

In [13]:
# Find train.json: works when run from project root, from notebooks/, or in Colab after upload
_candidates = [
    Path("data/training/train.json"),                    # project root or Colab
    Path.cwd() / "data" / "training" / "train.json",
    Path.cwd().parent / "data" / "training" / "train.json",  # run from notebooks/
]
_train_path = None
for _p in _candidates:
    if _p.exists():
        _train_path = _p
        break
if _train_path is None:
    raise FileNotFoundError(
        "train.json not found. Run from project root: python scripts/prepare_training_data.py\n"
        "Then either run this notebook from project root, or upload data/training/train.json into data/training/."
    )
DATA_DIR = str(_train_path.parent)
with open(_train_path, "r", encoding="utf-8") as f:
    train_data = json.load(f)

def format_prompt(ex):
    inp = ex.get("input", "")
    if inp:
        return f"### Instruction:\n{ex['instruction']}\n\n### Input:\n{inp}\n\n### Response:\n{ex['output']}"
    return f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"

train_texts = [format_prompt(ex) for ex in train_data]
dataset = Dataset.from_dict({"text": train_texts})
print(f"Train examples: {len(dataset)} (from {DATA_DIR})")

Train examples: 28 (from /Users/patrickniyo/Documents/auto-maintenance-assistant/data/training)
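
At inference time the model should see the same template, cut off right after the response header so that it generates the answer itself. The sketch below mirrors `format_prompt` without the `output` field; the helper name `build_inference_prompt` and the sample question are hypothetical.

```python
def build_inference_prompt(instruction, inp=""):
    """Build the training template up to and including the response header."""
    if inp:
        return f"### Instruction:\n{instruction}\n\n### Input:\n{inp}\n\n### Response:\n"
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


print(build_inference_prompt("How often should I change my engine oil?"))
```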


## 5. Load model and LoRA

In [14]:
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
model = prepare_model_for_kbit_training(model)

peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

Loading weights: 100%|██████████| 201/201 [02:01<00:00,  1.65it/s, Materializing param=model.norm.weight]


trainable params: 12,615,680 || all params: 1,112,664,064 || trainable%: 1.1338
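
The trainable parameter count can be reproduced by hand. Each adapted linear layer gains two low-rank matrices holding `r * (in_features + out_features)` weights in total. The sketch below assumes the TinyLlama-1.1B dimensions from the model's published config (hidden size 2048, MLP size 5632, 22 layers, 4 key/value heads of dimension 64); these values are not stated in this notebook, and the helper name `lora_param_count` is hypothetical.

```python
# Assumed TinyLlama-1.1B dimensions (from the model's published config).
HIDDEN, INTERMEDIATE, LAYERS = 2048, 5632, 22
KV_DIM = 4 * 64  # 4 key/value heads x head dimension 64 (grouped-query attention)
R = 16  # LORA_R from the config cell

# (in_features, out_features) of each targeted linear layer in one decoder block.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(shapes, r, layers):
    """Each adapted layer adds A (r x in) and B (out x r): r * (in + out) weights."""
    per_block = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_block * layers


trainable = lora_param_count(SHAPES, R, LAYERS)
print(f"{trainable:,}")  # 12,615,680
print(f"{100 * trainable / 1_112_664_064:.4f}%")  # 1.1338%
```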


## 6. Train (SFTTrainer)

In [17]:
# Use SFTConfig (trl 0.11+): SFT-specific args go here, not in SFTTrainer
from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir=OUTPUT_DIR,
    per_device_train_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION,
    learning_rate=LEARNING_RATE,
    num_train_epochs=NUM_EPOCHS,
    warmup_ratio=WARMUP_RATIO,  # deprecated in newer transformers (see the warning in the output); prefer warmup_steps
    logging_steps=5,
    save_strategy="epoch",
    save_total_limit=2,
    fp16=True,
    report_to="none",
    remove_unused_columns=False,
    # SFT-specific (required in SFTConfig for newer trl)
    dataset_text_field="text",
    max_length=MAX_SEQ_LENGTH,
    packing=False,
)

# trl 0.12+: use processing_class instead of tokenizer
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
)

trainer.train()
trainer.save_model(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
print(f"Model saved to {OUTPUT_DIR}")

warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
Adding EOS to train dataset: 100%|██████████| 28/28 [00:01<00:00, 25.32 examples/s]
Tokenizing train dataset: 100%|██████████| 28/28 [00:00<00:00, 683.75 examples/s]
Truncating train dataset: 100%|██████████| 28/28 [00:00<00:00, 8083.74 examples/s]
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'pad_token_id': 2}.


| Step | Training Loss |
|------|---------------|
| 5    | 2.160878      |




Model saved to car-maintenance-llm
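
To try the saved adapter, one option is to reload the base model and attach the LoRA weights with `peft`. The following is a rough sketch rather than a tested part of this notebook. It assumes a GPU runtime, that `car-maintenance-llm` contains the adapter and tokenizer files written by the cell above, and that the base model is loaded unquantized in float16. The function name `generate_answer` and its defaults are hypothetical.

```python
def generate_answer(prompt, adapter_dir="car-maintenance-llm",
                    base_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0", max_new_tokens=128):
    """Load the base model plus the saved LoRA adapter and generate a reply."""
    # Imports are local so that defining the function does not need a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_dir)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The prompt should follow the training template and end with the response header, for example `generate_answer("### Instruction:\nHow often should I rotate my tires?\n\n### Response:\n")`.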


## 7. Export / download

Download the `car-maintenance-llm` folder: zip it and download the archive, or copy it to Drive.
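
Zipping the whole folder also packs the intermediate `checkpoint-*` directories, which hold optimizer and scheduler state that is only needed to resume training. If only the final adapter is wanted, a smaller archive can be built with the standard library, as in this sketch. The helper name `zip_adapter` and the archive name `car-maintenance-llm-adapter.zip` are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_adapter(src_dir, zip_path):
    """Zip src_dir, skipping any checkpoint-* subdirectory; return archived names."""
    src = Path(src_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src)
            if path.is_dir() or rel.parts[0].startswith("checkpoint-"):
                continue
            # Keep the folder name as the archive root, like `zip -r` does.
            arcname = (Path(src.name) / rel).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return names


if Path("car-maintenance-llm").is_dir():
    print(zip_adapter("car-maintenance-llm", "car-maintenance-llm-adapter.zip"))
```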

In [18]:
!zip -r car-maintenance-llm.zip car-maintenance-llm
print("Download the file car-maintenance-llm.zip from the Colab file browser (left panel).")

  adding: car-maintenance-llm/ (stored 0%)
  adding: car-maintenance-llm/checkpoint-8/ (stored 0%)
  adding: car-maintenance-llm/checkpoint-8/adapter_model.safetensors (deflated 61%)
  adding: car-maintenance-llm/checkpoint-8/rng_state.pth (deflated 26%)
  adding: car-maintenance-llm/checkpoint-8/tokenizer_config.json (deflated 46%)
  adding: car-maintenance-llm/checkpoint-8/optimizer.pt (deflated 100%)
  adding: car-maintenance-llm/checkpoint-8/scheduler.pt (deflated 62%)
  adding: car-maintenance-llm/checkpoint-8/tokenizer.json (deflated 85%)
  adding: car-maintenance-llm/checkpoint-8/README.md (deflated 65%)
  adding: car-maintenance-llm/checkpoint-8/training_args.bin (deflated 53%)
  adding: car-maintenance-llm/checkpoint-8/adapter_config.json (deflated 58%)
  adding: car-maintenance-llm/checkpoint-8/chat_template.jinja (deflated 60%)
  adding: car-maintenance-llm/checkpoint-8/scaler.pt (deflated 64%)
  adding: car-maintenance-llm/checkpoint-8/trainer_state.json (deflated 56%)
  ad