# 🧠 LLM Fine-Tuning: Fraud Detection
This notebook trains an LLM to detect fraudulent transactions using tabular data converted to natural-language prompts.




## 📦 Install Dependencies

In [1]:
!pip install accelerate bitsandbytes datasets huggingface_hub  peft scikit-learn transformers trl



## 📚 Libraries

In [32]:
from datasets import ClassLabel, concatenate_datasets, load_dataset, DatasetDict
from google.colab import userdata
from huggingface_hub import login, notebook_login
import json
import numpy as np
import os
import pandas as pd
from pathlib import Path
from peft import LoraConfig
import pickle
import plotly.express as px
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_recall_curve,
    precision_recall_fscore_support,
    roc_auc_score,
    roc_curve,
    auc,
)
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline, TrainingArguments
from trl import SFTTrainer
import torch

## 🔐 Login to Hugging Face Hub

In [3]:
hf_token = os.environ.get('HF_Token') or userdata.get('HF_Token')

if hf_token:
    login(token=hf_token)
    print("HuggingFace login successful.")
else:
    print("HuggingFace token not found. Please set the HF_Token environment variable or store it as HF_Token in Colab secrets.")



HuggingFace login successful.


## 📊 Dataset Overview

### Dataset: Credit Card Fraud Detection

This dataset contains real anonymized credit card transactions labeled as **fraudulent (1)** or **legitimate (0)**.

Source: Kaggle → mirrored to Hugging Face  
Rows: 284,807 transactions  
Fraud rate: ~0.17% (highly imbalanced)

---

### 🧠 What the columns mean

To protect privacy, the transaction features were **PCA-transformed**, meaning the raw customer and merchant details were converted into anonymized numeric components:

- `V1` to `V28`: Anonymized PCA components  
- `Amount`: Transaction amount  
- `Time`: Seconds elapsed since the first transaction in the dataset  
- `Class`:
  - `0` = legitimate transaction  
  - `1` = fraudulent transaction (positive class)

---

### 🔍 Goal

Train an LLM to predict whether a transaction is fraudulent **based on numeric financial signals**.

This is a binary classification problem:

| Output | Meaning |
|------|--------|
| `0` | Genuine transaction |
| `1` | Fraudulent transaction |

---

### ⚠️ Important Considerations

- **Severe class imbalance**  
  Only 492 of the 284,807 transactions are fraudulent.  
  Without techniques like resampling or class weighting, a model can predict **all 0s** and still score ~99.8% accuracy while catching no fraud.

- **Numeric → Text Conversion**  
  LLMs don't read tables natively, so we convert each row into a structured **text prompt**.  
  Example:

Transaction features:
V1: -1.359
V2: 1.192
...
Amount: 149.62

Is this transaction fraudulent? Answer 0 or 1:


This turns tabular data into a natural-language classification task the LLM can learn from.
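
To make the class-imbalance point above concrete, here is a minimal arithmetic sketch. It uses only the row and fraud counts quoted in this overview and shows what an "always predict 0" classifier would score; nothing here touches the model or the dataset itself.

```python
# Counts taken from the dataset overview above.
n_total = 284_807
n_fraud = 492

fraud_rate = n_fraud / n_total

# A "model" that always predicts 0 (legitimate) is right on every non-fraud row.
accuracy_all_zero = (n_total - n_fraud) / n_total
# It never flags a fraud case, so recall on the positive class is 0.
recall_all_zero = 0 / n_fraud

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"all-zero accuracy: {accuracy_all_zero:.4%}")
print(f"all-zero recall:   {recall_all_zero:.0%}")
```

High accuracy with zero recall is exactly the failure mode that precision, recall, and F1 are meant to expose.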

---

### ✅ Why this dataset is good for fine-tuning

- Objective evaluation: F1 and ROC-AUC are measurable, unlike subjective chat quality
- Real-world financial fraud use case
- Shows ability to inject **domain knowledge not built into base LLMs**
- Converts tabular → language representations, demonstrating LLM versatility

---

### 🎯 What success looks like

| Metric | What to look for |
|--------|------------------|
| Accuracy | Reasonable baseline, but misleading on its own |
| Precision | Higher precision means fewer false fraud alerts |
| Recall | Higher recall means fewer missed fraud cases |
| F1 | ✅ Primary metric; balances precision and recall |
| ROC-AUC | ✅ Threshold-independent; useful alongside F1 on imbalanced data |

We expect fine-tuning to improve **recall, F1, and ROC-AUC** significantly vs. the base model.


## 📥 Load Fraud Dataset

In [4]:
# Load Dataset
# https://huggingface.co/datasets/David-Egea/Creditcard-fraud-detection
dataset = load_dataset("David-Egea/Creditcard-fraud-detection")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [5]:
# Inspect dataset
print(f"\ndataset['train'] head:")
print(dataset["train"][:5])

print("\ndataset['train'] column names:")
print(dataset["train"].column_names)


dataset['train'] head:
{'Time': [0, 0, 1, 1, 2], 'V1': [-1.3598071336738, 1.19185711131486, -1.35835406159823, -0.966271711572087, -1.15823309349523], 'V2': [-0.0727811733098497, 0.26615071205963, -1.34016307473609, -0.185226008082898, 0.877736754848451], 'V3': [2.53634673796914, 0.16648011335321, 1.77320934263119, 1.79299333957872, 1.548717846511], 'V4': [1.37815522427443, 0.448154078460911, 0.379779593034328, -0.863291275036453, 0.403033933955121], 'V5': [-0.338320769942518, 0.0600176492822243, -0.503198133318193, -0.0103088796030823, -0.407193377311653], 'V6': [0.462387777762292, -0.0823608088155687, 1.80049938079263, 1.24720316752486, 0.0959214624684256], 'V7': [0.239598554061257, -0.0788029833323113, 0.791460956450422, 0.23760893977178, 0.592940745385545], 'V8': [0.0986979012610507, 0.0851016549148104, 0.247675786588991, 0.377435874652262, -0.270532677192282], 'V9': [0.363786969611213, -0.255425128109186, -1.51465432260583, -1.38702406270197, 0.817739308235294], 'V10': [0.0907941

## 🧩 Create Prompts from Tabular Data



In [6]:
TARGET = "Class"
FEATURES = [c for c in dataset["train"].column_names if c != TARGET]

def row_to_prompt(example):
    feat_text = "\n".join([f"{col}: {example[col]}" for col in FEATURES])
    prompt = f"""Transaction features:
{feat_text}

Is this transaction fraudulent? Answer 0 or 1:"""
    return {
        "prompt": prompt,
        "label": str(example[TARGET])  # keep label as string for text models
    }
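
The prompt builder above writes every feature at full float precision, which makes each prompt long. As a hedged illustration only, the sketch below applies the same template to a toy row and rounds each value first. The function name `row_to_prompt_rounded`, the `decimals` parameter, and the toy row are assumptions for this example and are not part of the notebook; rounding may also discard signal, so it would need to be validated before use.

```python
# Toy stand-in for one dataset row; real rows have Time, V1..V28, Amount, Class.
toy_row = {
    "V1": -1.3598071336738,
    "V2": -0.0727811733098497,
    "Amount": 149.62,
    "Class": 0,
}

TARGET = "Class"
FEATURES = [c for c in toy_row if c != TARGET]


def row_to_prompt_rounded(example, decimals=3):
    """Hypothetical variant of row_to_prompt that rounds each feature value."""
    feat_text = "\n".join(
        f"{col}: {round(example[col], decimals)}" for col in FEATURES
    )
    prompt = (
        "Transaction features:\n"
        f"{feat_text}\n\n"
        "Is this transaction fraudulent? Answer 0 or 1:"
    )
    # Keep the label as a string, as in the original helper.
    return {"prompt": prompt, "label": str(example[TARGET])}


result = row_to_prompt_rounded(toy_row)
print(result["prompt"])
print("Label:", result["label"])
```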


## ✂️ Train / Test Split


In [7]:
# 📌 The original dataset only contains a 'train' split.

# Extract the raw dataset
full_ds = dataset["train"] if isinstance(dataset, DatasetDict) else dataset

# Ensure labels are numeric ints
full_ds = full_ds.map(lambda ex: {"Class": int(ex["Class"])}, batched=False)

# Convert the label column to ClassLabel format (required for stratified splitting)
label_schema = ClassLabel(num_classes=2, names=["0","1"])
full_ds = full_ds.cast_column("Class", label_schema)

# Create a stratified 80/20 train-test split to maintain the fraud ratio.
dataset = full_ds.train_test_split(
    test_size=0.2, seed=42, stratify_by_column="Class"
)

print(dataset)
print("Train fraud rate:",
      sum(dataset["train"]["Class"]) / len(dataset["train"]))
print("Test  fraud rate:",
      sum(dataset["test"]["Class"]) / len(dataset["test"]))

train_ds = dataset["train"].map(row_to_prompt)
test_ds  = dataset["test"].map(row_to_prompt)


DatasetDict({
    train: Dataset({
        features: ['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount', 'Class'],
        num_rows: 227845
    })
    test: Dataset({
        features: ['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount', 'Class'],
        num_rows: 56962
    })
})
Train fraud rate: 0.001729245759178389
Test  fraud rate: 0.0017204452090867595
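
The near-identical fraud rates printed above are what stratification is meant to achieve. The following pure-Python sketch shows the idea on toy labels: split each class separately, then merge. It is not how `datasets` implements `stratify_by_column` internally; the function `stratified_split` and the toy data are assumptions made for illustration.

```python
import random


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_size)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 10 fraud rows among 1,000 (1%), far less skewed than the real data.
labels = [1] * 10 + [0] * 990
train_idx, test_idx = stratified_split(labels)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

With a plain random split, a class this rare could easily end up over- or under-represented in the test set, which would make recall estimates unstable.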



In [8]:
# Preview the new dataset
for i in range(2):
    print(train_ds[i]["prompt"])
    print("Label:", train_ds[i]["label"])
    print("-" * 60)

Transaction features:
Time: 78940
V1: -1.57573494412357
V2: 1.86421745236004
V3: 0.753429793934046
V4: 2.09506068569828
V5: 0.230530723738579
V6: 0.67872831318796
V7: -0.425229175822418
V8: 0.15837628974303
V9: -1.72469216101923
V10: 0.0355609317787195
V11: -1.25376692445179
V12: 0.221987738962449
V13: 1.16896940338198
V14: 0.629234046173798
V15: 0.92202566513489
V16: 0.537772051253491
V17: -0.035251075519164
V18: -0.270915272118166
V19: -0.24520466409349
V20: -0.338029362540826
V21: 0.808491069583672
V22: -0.453954305310836
V23: -0.187844676422405
V24: -0.760518310056324
V25: 0.421496279299155
V26: 0.136501507784844
V27: -0.313489765446064
V28: -0.196089937803906
Amount: 4.72

Is this transaction fraudulent? Answer 0 or 1:
Label: 0
------------------------------------------------------------
Transaction features:
Time: 30881
V1: -1.67283619428826
V2: 1.40129707385683
V3: 1.50393962832975
V4: 2.17549051030442
V5: 0.699791086586098
V6: 1.0621387170741
V7: 1.11436359165808
V8: -0.5358219

## 📊 Baseline Model Evaluation (Before Fine-Tuning)

In [9]:
# Model
MODEL_ID = "Qwen/Qwen2-7B-Instruct"

# Load tokenizer to turn text <--> tokens
tok = AutoTokenizer.from_pretrained(MODEL_ID)

# Load model
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,                     # "Qwen/Qwen2-7B-Instruct"
    device_map="auto",            # Put model on GPU if available
    torch_dtype=torch.bfloat16    # Faster, lower-memory precision on modern GPUs.
)

# Build a text-generation pipeline
gen = pipeline(
    "text-generation",            # Create a text generator.  Feed prompts, get text answers
    model=model,                  # Qwen
    tokenizer=tok,                # Text <--> Tokens
    device_map="auto",            # Put model on GPU if available
    dtype=torch.bfloat16,         # Lower precision FP format
    max_new_tokens=2,             # we only need "0" or "1"
    do_sample=False,              # deterministic
)

# Helper function
def predict_label(prompt: str) -> int:
    """
    Feed a fraud-detection prompt to the model and parse its answer.

    Returns 1 only if the generated text contains "1" and no "0"; otherwise 0.
    """
    # Build a proper chat input
    msg = [{"role": "user", "content": prompt}]
    prompt_text = tok.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)
    # return_full_text=False keeps only the newly generated tokens, not the prompt
    out = gen(prompt_text, return_full_text=False)[0]["generated_text"]
    # Inspect the first few characters in case the output is quirky,
    # e.g. '1.' or '0 ('
    head = out.strip()[:5]
    return 1 if "1" in head and "0" not in head else 0

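# Score a small slice of the test set.
# Note: at a ~0.17% fraud rate, 200 rows may contain no fraud cases at all,
# in which case precision, recall, and F1 are reported as 0 (zero_division=0).
N_PREVIEW = 200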
y_true = [int(test_ds[i]["label"]) for i in range(N_PREVIEW)]
# Produce predictions with the helper function
y_pred = [predict_label(test_ds[i]["prompt"]) for i in range(N_PREVIEW)]

# Compute objective metrics
acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary", zero_division=0)

print(f"Baseline (N={N_PREVIEW})  Acc={acc:.3f}  Prec={prec:.3f}  Rec={rec:.3f}  F1={f1:.3f}")

`torch_dtype` is deprecated! Use `dtype` instead!


Device set to use cuda:0
The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


Baseline (N=200)  Acc=1.000  Prec=0.000  Rec=0.000  F1=0.000
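
A perfect accuracy together with zero precision and recall is the pattern you get when the evaluated slice contains no fraud rows and the model predicts 0 everywhere. The sketch below estimates how likely a fraud-free slice of 200 rows is. It treats rows as independent draws at the test-set fraud rate printed earlier, which is an approximation, since the real slice is taken without replacement from a finite test set.

```python
# Test-set fraud rate as printed by the split cell (rounded).
fraud_rate = 0.00172
n_preview = 200

# Expected number of fraud rows in the slice.
expected_fraud = fraud_rate * n_preview

# Probability that none of the 200 rows is fraud, under the
# independence approximation described above.
p_no_fraud = (1 - fraud_rate) ** n_preview

print(f"expected fraud rows in slice: {expected_fraud:.2f}")
print(f"P(no fraud rows in slice):    {p_no_fraud:.2f}")
```

Under these assumptions, a fraud-free slice is the more likely outcome, so a larger or fraud-enriched evaluation subset would be needed for a meaningful baseline.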


## ⚙️ Reload Model in 4-bit (QLoRA-Ready)



In [10]:
# Load in 4-bit optimizations
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,                                  # Load model weights in 4-bit
    bnb_4bit_use_double_quant=True,                     # Double quantization: also quantize the quantization constants
    bnb_4bit_quant_type="nf4",                          # Use NF4 (4-bit NormalFloat) quantization
    bnb_4bit_compute_dtype=torch.bfloat16,              # Run matmuls in bfloat16
)

# Load tokenizer to handle Text --> Tokens --> Text
tok = AutoTokenizer.from_pretrained(MODEL_ID)
tok.padding_side = "left"                               # Required for some decoder models + batching
if tok.pad_token is None:                               # Set PAD token to EOS if missing to prevent runtime errors during training
    tok.pad_token = tok.eos_token

# Load model with 4-bit config
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_cfg,                        # Apply the 4-bit config from above
    device_map={"": 0},                                 # Place the whole model on GPU 0
    trust_remote_code=True,                             # Not strictly required: recent transformers releases support Qwen2 natively
)
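
To see why 4-bit loading matters, here is a rough back-of-the-envelope estimate of weight memory at different precisions. The parameter count of about 7.6 billion is an assumption based on the published model size, and the figures ignore quantization constants, layers that stay unquantized, activations, and optimizer state, so treat them as lower bounds.

```python
# Assumption: roughly 7.6 billion parameters for Qwen2-7B (check the model card).
n_params = 7.6e9

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "nf4 (4-bit)": 0.5,
}

# Approximate weight storage in GiB for each precision.
weights_gib = {
    name: n_params * nbytes / 1024**3 for name, nbytes in BYTES_PER_PARAM.items()
}

for name, gib in weights_gib.items():
    print(f"{name:>12}: ~{gib:.1f} GiB for weights alone")
```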


## 🧩 Turn (prompt, label) pairs into chat examples

In [11]:
def as_chat(example):
    return {
        "messages": [
            {"role": "system", "content": "You are a precise fraud detector. Answer with 0 or 1 only."},
            {"role": "user",   "content": example["prompt"]},
            {"role": "assistant", "content": example["label"]},
        ]
    }


# Since positives are rare in this dataset, we have to balance the dataset so that the LLM can
# more accurately flag the positive class
pos = dataset["train"].filter(lambda ex: ex["Class"] == 1)
neg = dataset["train"].filter(lambda ex: ex["Class"] == 0)

# Don't fully balance: heavy duplication of the few fraud rows invites overfitting.
# Upsample so that positives are roughly 10% of the negative count.
target_ratio = 0.10
k = max(1, int(target_ratio * len(neg) / len(pos)))

pos_up = concatenate_datasets([pos] * k)

reb_train = concatenate_datasets([neg, pos_up]).shuffle(seed=42)

# proceed to prompts/chat
small = reb_train.shuffle(seed=42).select(range(min(10_000, len(reb_train))))
small_ds   = small.map(row_to_prompt, remove_columns=small.column_names)
train_chat = small_ds.map(as_chat, remove_columns=small_ds.column_names)
# Do the same for eval
eval_small = dataset["test"].select(range(min(10_000, len(dataset["test"]))))
eval_small_ds = eval_small.map(row_to_prompt, remove_columns=eval_small.column_names)
eval_chat = eval_small_ds.map(as_chat, remove_columns=eval_small_ds.column_names)

# Preview
print(train_chat[0]["messages"])

[{'content': 'You are a precise fraud detector. Answer with 0 or 1 only.', 'role': 'system'}, {'content': 'Transaction features:\nTime: 122504\nV1: 2.0764617002505\nV2: -0.114192663660285\nV3: -1.396404288911\nV4: 0.253461576192227\nV5: 0.227932652287347\nV6: -0.769558407420263\nV7: 0.214656584129816\nV8: -0.308175139145632\nV9: 0.547828152074401\nV10: 0.0566790311476271\nV11: -1.36516676177282\nV12: 0.341826871119896\nV13: 0.482456802285992\nV14: 0.178538693730716\nV15: 0.0664784252197458\nV16: 0.103502009336066\nV17: -0.577035605762703\nV18: -0.654878529730773\nV19: 0.421200171498624\nV20: -0.140776285585276\nV21: -0.323220039510141\nV22: -0.825055770778101\nV23: 0.226055689065142\nV24: -0.707010716692866\nV25: -0.195857967715366\nV26: 0.239467702192428\nV27: -0.0736838321760188\nV28: -0.0634390622302762\nAmount: 21.99\n\nIs this transaction fraudulent? Answer 0 or 1:', 'role': 'user'}, {'content': '0', 'role': 'assistant'}]
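
The upsampling arithmetic in the cell above can be checked by hand. This sketch assumes 394 fraud rows in the training split, a value inferred from the printed row count and fraud rate rather than printed directly, and reproduces the duplication factor `k` and the resulting class mix.

```python
# Counts implied by the split output: 227,845 training rows at a fraud
# rate of about 0.173% corresponds to 394 positives (assumed exact here).
n_train = 227_845
n_pos = 394
n_neg = n_train - n_pos

target_ratio = 0.10
k = max(1, int(target_ratio * n_neg / n_pos))  # copies of each fraud row

n_pos_up = n_pos * k
pos_share = n_pos_up / (n_neg + n_pos_up)

print(f"k = {k}")
print(f"upsampled positives = {n_pos_up}")
print(f"positive share after rebalancing = {pos_share:.3%}")
print(f"expected positives in a 10,000-row sample ~ {10_000 * pos_share:.0f}")
```

Note that `target_ratio` is a ratio of positives to negatives, so the positive share of the whole rebalanced set ends up slightly below 10%.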


## 🛠️ QLoRA config + Trainer

In [12]:
# LoraConfig tells PEFT how to apply LoRA. LoRA trains small adapter matrices instead of the full model, which drastically lowers memory and training cost.
peft_cfg = LoraConfig(
    r=16,                                                                                     # LoRA rank - Dimensionality of LoRA A/B matrices. Higher = more trainable params & capacity
    lora_alpha=32,                                                                            # Scaling factor, scales LoRA updates. Larger can help stability
    lora_dropout=0.05,                                                                        # Dropout probability for LoRA layers. Prevents overfitting; small because LoRA already regularizes
    bias="none",                                                                              # Ignore bias parameters
    task_type="CAUSAL_LM",                                                                    # Autoregressive language model
    target_modules=["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"]    # Model weights to apply LoRA to
)

args = TrainingArguments(
    output_dir="./qLoRA-fraud",             # Where to save checkpoints
    num_train_epochs=1,                     # How many epochs to train for
    per_device_train_batch_size=4,          # Micro-batch size, small because QLoRA fits modest GPUs
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,          # Effective batch size = 4 × 4 = 16 (per-device batch × accumulation steps)
    learning_rate=2e-5,                     # Conservative for LoRA; values around 1e-4 to 2e-4 are also common
    bf16=True,                              # Mixed precision (bfloat16)
    logging_steps=50,                       # Log every 50 steps
    # evaluation_strategy="epoch",            # Evaluate after each epoch
    save_strategy="epoch",                  # Save checkpoints per epoch
    lr_scheduler_type="cosine",             # Cosine learning rate decay
    warmup_ratio=0.05,                      # 5% warmup to prevent early instability
    gradient_checkpointing=True,            # Save memory during training
    report_to="none",                       # Disable logging to W&B
)

trainer = SFTTrainer(
    model=model,
    # tokenizer=tok,                                                    # Tokenizer matching the model
    peft_config=peft_cfg,                                               # Apply the LoRA settings defined above
    train_dataset=train_chat,                                           # Training dataset with formatted messages
    eval_dataset=eval_chat.select(range(min(5000, len(eval_chat)))),    # Eval subset (max 5000 items) to keep evaluation fast
    # dataset_text_field=None,                                          # Because we pass structured chat messages, not raw text field
    # max_seq_length=512,                                               # Max tokens in input sequences
    # packing=False,                                                    # Disable sequence packing; 1 sample per forward pass
    args=args,
)
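
For a sense of how small the trainable part is, the sketch below counts the LoRA parameters implied by `r=16` and the seven target modules. The layer dimensions are assumptions taken from the published Qwen2-7B configuration and are not stated in this notebook; verify them against `model.config` before relying on the numbers.

```python
# Assumed Qwen2-7B dimensions (verify with model.config).
HIDDEN = 3584          # hidden_size
INTERMEDIATE = 18944   # intermediate_size
KV_DIM = 512           # num_key_value_heads (4) * head_dim (128)
N_LAYERS = 28          # num_hidden_layers
R = 16                 # LoRA rank from peft_cfg

# (in_features, out_features) of each targeted linear layer.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# LoRA adds A (r x in) and B (out x r) per module: r * (in + out) parameters.
per_layer = sum(R * (d_in + d_out) for d_in, d_out in TARGETS.values())
total = per_layer * N_LAYERS

print(f"LoRA params per transformer block: {per_layer:,}")
print(f"LoRA params in total:              {total:,}")
print(f"Size at 4 bytes per param:         {total * 4 / 1e6:.0f} MB")
```

Under these assumptions the adapters amount to well under 1% of the base model's parameters, which is what makes training on a single GPU practical.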


In [13]:
# Make sure a pad token is set, since the tokenizer was not passed to the trainer explicitly
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
    model.config.pad_token_id = tok.pad_token_id


In [14]:
# Trainer performance toggles
torch.backends.cuda.matmul.allow_tf32 = True
if hasattr(model, "enable_input_require_grads"):
    model.enable_input_require_grads()
model.config.use_cache = False           # needed with gradient checkpointing

## Test GPU

In [15]:
!nvidia-smi

Sun Nov  9 22:50:09 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   31C    P0             53W /  400W |   21953MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

In [16]:
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))

True
NVIDIA A100-SXM4-40GB


In [17]:
print(next(model.parameters()).device)   # should be cuda:0 (some 4-bit layers show 'meta' but there should be CUDA params)
try:
    print(model.hf_device_map)          # should list layers on 'cuda:0'
except AttributeError:
    pass


cuda:0
{'': 0}


## 🚀 Fine-Tune Model with QLoRA

In [18]:
trainer.train()

The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151643}.
  return fn(*args, **kwargs)


| Step | Training Loss |
|------|---------------|
| 50 | 1.6411 |
| 100 | 1.4693 |
| 150 | 1.4663 |
| 200 | 1.4641 |
| 250 | 1.4608 |
| 300 | 1.459 |
| 350 | 1.4571 |
| 400 | 1.4531 |
| 450 | 1.4532 |
| 500 | 1.4513 |


TrainOutput(global_step=625, training_loss=1.472267694091797, metrics={'train_runtime': 3502.664, 'train_samples_per_second': 2.855, 'train_steps_per_second': 0.178, 'total_flos': 2.986059147236229e+17, 'train_loss': 1.472267694091797, 'entropy': 1.4510162556171418, 'num_tokens': 6942260.0, 'mean_token_accuracy': 0.429336784183979, 'epoch': 1.0})
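
The step count in `TrainOutput` can be cross-checked against the training arguments. This short sketch assumes a single GPU, as the `nvidia-smi` output suggests, and uses the 10,000-example training subset and the `num_tokens` value reported above.

```python
import math

n_examples = 10_000          # size of train_chat
per_device_batch = 4         # per_device_train_batch_size
grad_accum = 4               # gradient_accumulation_steps
n_devices = 1                # single GPU assumed
epochs = 1

effective_batch = per_device_batch * grad_accum * n_devices
steps = math.ceil(n_examples / effective_batch) * epochs

total_tokens = 6_942_260     # num_tokens reported in TrainOutput
avg_tokens = round(total_tokens / n_examples)

print("effective batch size:", effective_batch)
print("optimizer steps:", steps)
print("avg tokens per example:", avg_tokens)
```

The result agrees with `global_step=625`, and the average sequence length shows how much of each example is taken up by the full-precision feature values.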

## 💾 Save the LoRA adapter and push to HuggingFace Hub

In [19]:
adapter_dir = "fraud_qLoRA_adapter"
trainer.model.save_pretrained(adapter_dir)
tok.save_pretrained(adapter_dir)
print("Saved:", adapter_dir)

notebook_login()
trainer.model.push_to_hub("david125tran/fraud-qLoRA-adapter")
tok.push_to_hub("david125tran/fraud-qLoRA-adapter")

Saved: fraud_qLoRA_adapter



No files have been modified since last commit. Skipping to prevent empty commit.


CommitInfo(commit_url='https://huggingface.co/david125tran/fraud-qLoRA-adapter/commit/9c25d1be277cf633d3da82646c077268d899fcb3', commit_message='Upload tokenizer', commit_description='', oid='9c25d1be277cf633d3da82646c077268d899fcb3', pr_url=None, repo_url=RepoUrl('https://huggingface.co/david125tran/fraud-qLoRA-adapter', endpoint='https://huggingface.co', repo_type='model', repo_id='david125tran/fraud-qLoRA-adapter'), pr_revision=None, pr_num=None)

## ✅ Model Evaluation (After Fine-Tuning)

In [35]:
# 1) Ensure consistent dtype/device
trainer.model.to(torch.float16)  # OK for inference on most Colab GPUs
trainer.model.eval()

tok.padding_side = "left"
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

param_dtype = next(trainer.model.parameters()).dtype
print("Eval device:", next(trainer.model.parameters()).device)
print("Eval dtype:", param_dtype)

# 2) Build a text-generation pipeline for the FT model
gen_ft = pipeline(
    "text-generation",
    model=trainer.model,
    tokenizer=tok,
    device_map="auto",
    torch_dtype=param_dtype,     # keep inputs & weights aligned
    max_new_tokens=2,
    do_sample=False,
)

# 3) Prepare chat-formatted prompts (use same system prompt as training)
N = min(5000, len(test_ds))
msgs = [
    [
        {"role": "system", "content": "You are a precise fraud detector. Answer with 0 or 1 only."},
        {"role": "user",   "content": test_ds[i]["prompt"]},
    ]
    for i in range(N)
]
prompt_texts = [tok.apply_chat_template(m, tokenize=False, add_generation_prompt=True) for m in msgs]

# 4) Run batched generation
outs = gen_ft(prompt_texts, batch_size=32)

# 5) Extract predictions from only the generated continuation
def extract_pred(out_text: str, prompt_text: str) -> int:
    gen_only = out_text[len(prompt_text):].strip()[:5]
    return 1 if ("1" in gen_only and "0" not in gen_only) else 0

y_pred = [extract_pred(o[0]["generated_text"], pt) for o, pt in zip(outs, prompt_texts)]
y_true = [int(test_ds[i]["label"]) for i in range(N)]

# 6) Metrics
acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", zero_division=0
)

print(f"AFTER FT (N={N})  Acc={acc:.3f}  Prec={prec:.3f}  Rec={rec:.3f}  F1={f1:.3f}")
# ================================================

Device set to use cuda:0


Eval device: cuda:0
Eval dtype: torch.float16
AFTER FT (N=5000)  Acc=0.993  Prec=0.273  Rec=0.923  F1=0.421


## 📈 Results Visualization (Confusion Matrix)

In [36]:
# Compute confusion matrix
cm = confusion_matrix(y_true, y_pred)

# Convert to DataFrame for Plotly
cm_df = pd.DataFrame(
    cm,
    index=['Actual: 0', 'Actual: 1'],
    columns=['Pred: 0', 'Pred: 1']
)

fig = px.imshow(
    cm_df,
    text_auto=True,
    color_continuous_scale='Blues',
    title="Confusion Matrix"
)
fig.update_layout(xaxis_title="Predicted Label", yaxis_title="True Label")
fig.show()
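
The confusion matrix and the summary metrics are two views of the same four counts. scikit-learn's `confusion_matrix` lays them out as `[[TN, FP], [FN, TP]]` for labels 0 and 1. The counts below are illustrative assumptions, chosen to be consistent with the metrics reported after fine-tuning (N=5000); the notebook text does not print the actual matrix.

```python
# Illustrative counts (assumed, not printed by the notebook) that are
# consistent with the reported Acc=0.993, Prec=0.273, Rec=0.923, F1=0.421.
tn, fp, fn, tp = 4955, 32, 1, 12

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)          # share of fraud alerts that are real
recall = tp / (tp + fn)             # share of real fraud that is caught
f1 = 2 * precision * recall / (precision + recall)

print(f"Acc={accuracy:.3f}  Prec={precision:.3f}  Rec={recall:.3f}  F1={f1:.3f}")
```

With so few positives, a handful of false alarms is enough to pull precision down sharply even while accuracy stays above 99%.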

## 📈 Results Visualization (ROC Curve)

In [37]:
def predict_label_and_score(prompt: str, mdl=None, tokenizer=None):
    import torch

    # 1) pick model + tokenizer
    mdl = trainer.model if mdl is None else mdl
    tok_local = tokenizer if tokenizer is not None else tok

    # 2) build input text
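    if hasattr(tok_local, "apply_chat_template"):
        # Use the same system prompt as in training and in the generation-based evaluation
        msg = [
            {"role": "system", "content": "You are a precise fraud detector. Answer with 0 or 1 only."},
            {"role": "user", "content": prompt},
        ]
        input_text = tok_local.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)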
    else:
        input_text = prompt

    inputs = tok_local(input_text, return_tensors="pt")
    inputs = {k: v.to(mdl.device) for k, v in inputs.items()}

    # 3) pick AMP dtype from first param (works for PEFT/4-bit)
    try:
        first_dtype = next(mdl.parameters()).dtype
    except StopIteration:
        first_dtype = torch.float16

    mdl.eval()
    with torch.no_grad():
        if mdl.device.type == "cuda":
            amp_dtype = torch.bfloat16 if first_dtype == torch.bfloat16 else torch.float16
            with torch.autocast(device_type="cuda", dtype=amp_dtype):
                logits_last = mdl(**inputs).logits[0, -1]
        else:
            logits_last = mdl(**inputs).logits[0, -1]

    # 4) robust ids for "0"/"1" (with/without leading space)
    ids_zero, ids_one = [], []
    for s in ["0", " 0"]:
        ids = tok_local(s, add_special_tokens=False).input_ids
        if len(ids) == 1: ids_zero.append(ids[0])
    for s in ["1", " 1"]:
        ids = tok_local(s, add_special_tokens=False).input_ids
        if len(ids) == 1: ids_one.append(ids[0])
    if not ids_zero: ids_zero = [tok_local("0", add_special_tokens=False).input_ids[0]]
    if not ids_one:  ids_one  = [tok_local("1", add_special_tokens=False).input_ids[0]]

    z_logit = torch.stack([logits_last[i] for i in ids_zero]).max()
    o_logit = torch.stack([logits_last[i] for i in ids_one]).max()

    pair = torch.stack([z_logit, o_logit]).to(torch.float32)
    probs = torch.softmax(pair, dim=-1)
    p1 = probs[1].item()
    label = 1 if p1 > 0.5 else 0
    return label, p1



y_true = []
y_scores = []
y_pred = []

for i in range(N):
    true = int(test_ds[i]["label"])
    pred, score = predict_label_and_score(test_ds[i]["prompt"])

    y_true.append(true)
    y_pred.append(pred)
    y_scores.append(score)

fpr, tpr, _ = roc_curve(y_true, y_scores)
roc_auc = auc(fpr, tpr)

roc_df = pd.DataFrame({"FPR": fpr, "TPR": tpr})

fig = px.line(
    roc_df,
    x="FPR", y="TPR",
    title=f"ROC Curve (AUC = {roc_auc:.4f})"
)
fig.add_shape(
    type="line", line=dict(dash='dash'),
    x0=0, x1=1, y0=0, y1=1   # baseline diagonal
)
fig.update_layout(xaxis_title="False Positive Rate", yaxis_title="True Positive Rate")
fig.show()
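
The scoring function above turns two next-token logits into a fraud probability by taking a softmax over just the "0" and "1" candidates. The sketch below isolates that step in plain Python with hypothetical logit values, and shows that a two-way softmax is the same as a sigmoid of the logit difference. The function name `fraud_probability` is an assumption for this example.

```python
import math


def fraud_probability(logit_zero: float, logit_one: float) -> float:
    """Softmax over the two candidate answers, returning P("1")."""
    m = max(logit_zero, logit_one)      # subtract the max for numerical stability
    e0 = math.exp(logit_zero - m)
    e1 = math.exp(logit_one - m)
    return e1 / (e0 + e1)


# Hypothetical logits for the tokens "0" and "1" at the answer position.
p1 = fraud_probability(logit_zero=1.0, logit_one=3.0)
sigmoid_of_diff = 1 / (1 + math.exp(-(3.0 - 1.0)))

print(f"P(fraud) = {p1:.4f}")
print(f"sigmoid(logit_one - logit_zero) = {sigmoid_of_diff:.4f}")
```

Because the score depends only on the difference between the two logits, it gives a continuous ranking signal for the ROC curve, independent of the 0.5 cut-off used for the hard label.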