# Experiment 2: Fine-Tuning with QLoRA
## Solana Smart Contract Vulnerability Detection

---

### Objective
Fine-tune LLaMA-3.1-8B-Instruct using QLoRA for binary vulnerability classification.

### Method
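- **QLoRA**: Quantized Low-Rank Adaptation. The base model is quantized to 4 bits and only small LoRA adapters are trained, which makes fine-tuning an 8B model feasible on a consumer GPU
- **DataCollatorForCompletionOnlyLM**: Computes the loss ONLY on the assistant response (the classification label), so prompt and code tokens do not contribute to the gradient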

### Key Difference from Zero-Shot
| Aspect | Zero-Shot | Fine-Tuning |
|--------|-----------|-------------|
| Training | None | 3 epochs on domain data |
| Adaptation | Prompt engineering | Weight updates via LoRA |
| Domain Knowledge | General | Solana-specific |

### References
- Dettmers et al. (2023). *QLoRA: Efficient Finetuning of Quantized LLMs*. arXiv:2305.14314
- Hu et al. (2021). *LoRA: Low-Rank Adaptation of Large Language Models*. arXiv:2106.09685

---

## 1. Environment Setup

Verify GPU availability and install required packages.

In [None]:
import torch
import os
import warnings

# Clean output
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['TRANSFORMERS_VERBOSITY'] = 'error'

print("=" * 60)
print("ENVIRONMENT CHECK")
print("=" * 60)

if torch.cuda.is_available():
    GPU_NAME = torch.cuda.get_device_name(0)
    GPU_MEMORY = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"✓ GPU: {GPU_NAME}")
    print(f"✓ VRAM: {GPU_MEMORY:.1f} GB")
    print(f"✓ PyTorch: {torch.__version__}")
    print(f"✓ CUDA: {torch.version.cuda}")
else:
    raise RuntimeError("GPU not detected! Enable GPU: Settings → Accelerator → GPU T4 x2")

In [None]:
%%capture
# Install packages silently (compatible versions)
!pip install -q bitsandbytes accelerate
!pip install -q peft==0.9.0
!pip install -q trl==0.12.0

In [None]:
print("✓ Packages installed successfully")

## 2. Imports & Authentication

Load libraries and authenticate with HuggingFace for model access.

In [None]:
import json
import logging
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from collections import defaultdict, Counter
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, 
    confusion_matrix, classification_report
)
from tqdm import tqdm
from datasets import Dataset
from huggingface_hub import login

# Suppress verbose logging
for logger_name in ["transformers", "accelerate", "peft", "trl", "bitsandbytes"]:
    logging.getLogger(logger_name).setLevel(logging.ERROR)

# Authentication
from kaggle_secrets import UserSecretsClient
HF_TOKEN = UserSecretsClient().get_secret("HF_TOKEN")
login(token=HF_TOKEN, add_to_git_credential=False)

print("✓ Authentication successful")

## 3. Load Dataset

Load the Solana vulnerability dataset containing 140 samples across 7 vulnerability types.

In [None]:
# Dataset path configuration
POSSIBLE_PATHS = [
    "/kaggle/input/solana-dataset/solana_140s_final.json",
    "/kaggle/input/solana-vuln-dataset/solana_140s_final.json",
    "/kaggle/input/solana_140s_final.json",
]

DATASET_PATH = None
for path in POSSIBLE_PATHS:
    if os.path.exists(path):
        DATASET_PATH = path
        break

if DATASET_PATH is None:
    raise FileNotFoundError("Dataset not found. Please upload solana_140s_final.json")

with open(DATASET_PATH, 'r') as f:
    dataset = json.load(f)

# Dataset statistics
print("=" * 60)
print("DATASET STATISTICS")
print("=" * 60)
print(f"Total samples: {len(dataset)}")
print(f"Source: {DATASET_PATH}")
print("\nVulnerability Distribution:")
for vtype, count in sorted(Counter(s['vulnerability_type'] for s in dataset).items()):
    print(f"  {vtype}: {count}")

label_dist = Counter(s['label'] for s in dataset)
print(f"\nLabel Distribution:")
print(f"  VULNERABLE: {label_dist['VULNERABLE']}")
print(f"  SAFE: {label_dist['SAFE']}")

## 4. Data Preparation

### Why Stratified Split by Vulnerability Type?

Each vulnerability type has unique patterns. Stratified splitting ensures:
1. All vulnerability types appear in train/val/test
2. Label balance is maintained within each type
3. Validation and test sets mirror the dataset's own type and label distribution
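
One way to confirm that a split has these properties is to count vulnerability types and labels in each partition. The sketch below is a minimal pure-Python check; the helper names `split_coverage` and `missing_types` and the toy records are hypothetical, and only the `vulnerability_type` and `label` keys come from the dataset used in this notebook.

```python
from collections import Counter


def split_coverage(splits):
    """Return {split_name: Counter of (vulnerability_type, label) pairs}."""
    return {
        name: Counter((s["vulnerability_type"], s["label"]) for s in samples)
        for name, samples in splits.items()
    }


def missing_types(splits):
    """List (split_name, vulnerability_type) pairs where a type is absent."""
    all_types = {
        s["vulnerability_type"] for samples in splits.values() for s in samples
    }
    missing = []
    for name, samples in splits.items():
        present = {s["vulnerability_type"] for s in samples}
        missing.extend((name, t) for t in sorted(all_types - present))
    return missing


# Toy data (hypothetical): type "B" is deliberately absent from "val".
toy = {
    "train": [
        {"vulnerability_type": t, "label": l}
        for t in ("A", "B")
        for l in ("VULNERABLE", "SAFE")
    ],
    "val": [{"vulnerability_type": "A", "label": "SAFE"}],
    "test": [
        {"vulnerability_type": "A", "label": "VULNERABLE"},
        {"vulnerability_type": "B", "label": "SAFE"},
    ],
}

print(missing_types(toy))
print(split_coverage(toy)["train"])
```

Passing `{"train": train_data, "val": val_data, "test": test_data}` to `missing_types` should return an empty list if every type reached every partition.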

In [None]:
# Stratified split by vulnerability type
by_vuln_type = defaultdict(list)
for sample in dataset:
    by_vuln_type[sample['vulnerability_type']].append(sample)

train_data, val_data, test_data = [], [], []

for vtype, samples in by_vuln_type.items():
    labels = [s['label'] for s in samples]
    
    # 80% train, 20% temp
    train_samples, temp_samples = train_test_split(
        samples, test_size=0.2, stratify=labels, random_state=42
    )
    
    # Split temp 50/50 into val and test
    temp_labels = [s['label'] for s in temp_samples]
    val_samples, test_samples = train_test_split(
        temp_samples, test_size=0.5, stratify=temp_labels, random_state=42
    )
    
    train_data.extend(train_samples)
    val_data.extend(val_samples)
    test_data.extend(test_samples)

print("=" * 60)
print("DATA SPLIT")
print("=" * 60)
print(f"Training:   {len(train_data):3d} samples ({len(train_data) / len(dataset):.0%})")
print(f"Validation: {len(val_data):3d} samples ({len(val_data) / len(dataset):.0%})")
print(f"Test:       {len(test_data):3d} samples ({len(test_data) / len(dataset):.0%})")

In [None]:
# Create HuggingFace datasets for training
train_dataset = Dataset.from_list([{"text": s["text"]} for s in train_data])
val_dataset = Dataset.from_list([{"text": s["text"]} for s in val_data])

print(f"✓ Train dataset: {len(train_dataset)} samples")
print(f"✓ Val dataset: {len(val_dataset)} samples")

## 5. Load Base Model

### QLoRA Quantization

QLoRA enables fine-tuning large models on consumer GPUs by:
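1. **4-bit NF4 quantization**: Shrinks the 8B base weights from ~32 GB in FP32 (~16 GB in FP16) to roughly 5-6 GB
2. **Double quantization**: Quantizes the quantization constants themselves, saving additional memory
3. **Paged optimizers**: Page optimizer state out to CPU RAM during memory spikes, reducing the risk of OOM errors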
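
The memory figures above follow from simple arithmetic on the parameter count. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes a round 8.0e9 parameters and counts weight storage only. The 4.127 bits-per-parameter figure for NF4 with double quantization is the value reported in the QLoRA paper for a block size of 64.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8.0e9  # assumption: round figure for an "8B" model

FORMATS = [
    ("FP32", 32),
    ("FP16", 16),
    ("NF4", 4),
    ("NF4 + double quantization constants", 4.127),
]

for name, bits in FORMATS:
    print(f"{name}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Real usage is higher than the 4-bit estimate because some parts of the model are typically kept in higher precision, and activations, LoRA weights, and optimizer state also occupy memory.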

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

# QLoRA quantization configuration
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 for better accuracy
    bnb_4bit_compute_dtype=torch.float16, # Compute in FP16
    bnb_4bit_use_double_quant=True        # Double quantization for memory
)

# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=HF_TOKEN)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Required for completion-only training

# Load model
print("Loading model (this may take 2-3 minutes)...")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",
    token=HF_TOKEN,
    low_cpu_mem_usage=True,
    use_cache=False  # Disable KV cache for training
)

print("\n" + "=" * 60)
print("MODEL LOADED")
print("=" * 60)
print(f"Model: {MODEL_ID}")
print(f"Parameters: {model.num_parameters():,}")
print(f"Quantization: 4-bit NF4")
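vram_gb = sum(torch.cuda.memory_allocated(i) for i in range(torch.cuda.device_count())) / 1e9
print(f"VRAM allocated: {vram_gb:.1f} GB")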

## 6. LoRA Configuration

### How LoRA Works

Instead of fine-tuning all 8B parameters, LoRA:
1. Freezes original weights
2. Adds small trainable low-rank matrices (rank=64) to the attention and MLP projection layers
3. Trains only ~2% of the parameters (about 168M of 8B at this rank)

**Target Modules**:
- `q_proj, k_proj, v_proj, o_proj`: Attention mechanism
- `gate_proj, up_proj, down_proj`: MLP layers
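
The number of trainable parameters can be estimated by hand: for a linear layer with `in_features` inputs and `out_features` outputs, LoRA adds a matrix A of shape (r, in_features) and a matrix B of shape (out_features, r). The sketch below applies this to the seven target modules. The layer dimensions are assumptions taken from the public Llama-3.1-8B configuration, not values read from this notebook.

```python
# Assumed dimensions (public Llama-3.1-8B config; not read from this notebook)
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size of the MLP
KV_DIM = 1024          # 8 key/value heads * 128 head dimension
NUM_LAYERS = 32        # number of transformer blocks
RANK = 64              # LoRA rank used in this experiment

# (in_features, out_features) for each target module in one block
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters in A (rank x in_features) plus B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS

print(f"LoRA parameters per block: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
print(f"Share of an 8B model:      {total / 8.0e9:.1%}")
```

Under these assumptions the total comes to roughly 168M parameters, about 2% of the base model. Halving the rank halves this count, since every term is linear in `r`.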

In [None]:
from peft import LoraConfig, prepare_model_for_kbit_training

# Prepare model for k-bit training (freezes base weights and enables gradient checkpointing)
model = prepare_model_for_kbit_training(model)

# LoRA configuration
LORA_CONFIG = LoraConfig(
    r=64,                    # Rank: higher = more capacity, more memory
    lora_alpha=16,           # Scaling factor: the LoRA update is scaled by alpha / r (0.25 here)
    lora_dropout=0.1,        # Dropout for regularization
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # Attention
        "gate_proj", "up_proj", "down_proj"       # MLP
    ],
    bias="none",
    task_type="CAUSAL_LM"
)

print("=" * 60)
print("LoRA CONFIGURATION")
print("=" * 60)
print(f"Rank (r): {LORA_CONFIG.r}")
print(f"Alpha: {LORA_CONFIG.lora_alpha}")
print(f"LoRA scaling (alpha / r): {LORA_CONFIG.lora_alpha / LORA_CONFIG.r}")
print(f"Dropout: {LORA_CONFIG.lora_dropout}")
print(f"Target modules: Attention + MLP (7 projection modules per layer)")

## 7. Training Configuration

### Critical: DataCollatorForCompletionOnlyLM

**Problem without it**: Standard training computes the loss over the entire sequence:
```
Loss = CrossEntropy("<system>...<user>...code...<assistant>VULNERABLE")
```
Nearly all of those tokens are prompt and source code, so the gradient is dominated by reproducing the input and the label contributes very little.

**Solution**: Compute the loss ONLY on the response (everything after `<|start_header_id|>assistant<|end_header_id|>`):
```
Loss = CrossEntropy("VULNERABLE")  # Only the response tokens
```

This focuses the training signal on the classification decision rather than on reproducing the prompt.
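
The masking idea can be illustrated without any library: every label up to and including the response template is set to the ignore index, so only the response tokens are scored. The sketch below is a simplified illustration of that idea, not TRL's implementation; the helper names and the token ids are made up.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def find_sublist(sequence, pattern):
    """Return the start index of the first occurrence of pattern, or -1."""
    for start in range(len(sequence) - len(pattern) + 1):
        if sequence[start:start + len(pattern)] == pattern:
            return start
    return -1


def mask_prompt(input_ids, template_ids):
    """Copy input_ids to labels, ignoring everything through the template."""
    start = find_sublist(input_ids, template_ids)
    if start == -1:
        # Template not found: nothing can be scored for this example.
        return [IGNORE_INDEX] * len(input_ids)
    response_start = start + len(template_ids)
    return [IGNORE_INDEX] * response_start + input_ids[response_start:]


# Made-up token ids: prompt tokens, a 3-token template, then label + end token.
input_ids = [1, 11, 12, 13, 90, 91, 92, 501, 2]
template_ids = [90, 91, 92]

labels = mask_prompt(input_ids, template_ids)
print(labels)
```

Only the last two positions keep their token ids, so the loss depends on the label and the end-of-turn token alone. If the template is missing from a sample, for example because it was truncated away, that sample contributes no loss at all.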

In [None]:
from trl import SFTTrainer, SFTConfig, DataCollatorForCompletionOnlyLM

# Response template marker - loss computed only AFTER this
RESPONSE_TEMPLATE = "<|start_header_id|>assistant<|end_header_id|>"

# Data collator for completion-only training
data_collator = DataCollatorForCompletionOnlyLM(
    response_template=RESPONSE_TEMPLATE,
    tokenizer=tokenizer
)

print("=" * 60)
print("DATA COLLATOR")
print("=" * 60)
print(f"Type: DataCollatorForCompletionOnlyLM")
print(f"Response template: {RESPONSE_TEMPLATE}")
print(f"Effect: Loss computed ONLY on the assistant response")
print(f"Benefit: Training signal comes from the label, not from prompt tokens")

In [None]:
# Training configuration
TRAINING_CONFIG = SFTConfig(
    output_dir="/kaggle/working/checkpoints",
    
    # Training duration
    num_train_epochs=3,
    
    # Batch configuration
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,  # Effective batch = 2 * 4 = 8
    
    # Optimizer (paged for memory efficiency)
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    
    # Mixed precision
    fp16=True,
    
    # Logging & evaluation
    logging_steps=10,
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
    load_best_model_at_end=True,
    
    # Sequence handling
    max_seq_length=1024,
    packing=False,  # No packing for classification
    
    # Memory optimization
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    
    # Other
    report_to="none",
    seed=42
)

print("=" * 60)
print("TRAINING CONFIGURATION")
print("=" * 60)
print(f"Epochs: {TRAINING_CONFIG.num_train_epochs}")
print(f"Batch size: {TRAINING_CONFIG.per_device_train_batch_size}")
print(f"Gradient accumulation: {TRAINING_CONFIG.gradient_accumulation_steps}")
print(f"Effective batch size: {TRAINING_CONFIG.per_device_train_batch_size * TRAINING_CONFIG.gradient_accumulation_steps}")
print(f"Learning rate: {TRAINING_CONFIG.learning_rate}")
print(f"Max sequence length: {TRAINING_CONFIG.max_seq_length}")
print(f"Warmup ratio: {TRAINING_CONFIG.warmup_ratio}")

In [None]:
# Initialize trainer
trainer = SFTTrainer(
    model=model,
    args=TRAINING_CONFIG,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    peft_config=LORA_CONFIG,      # LoRA applied here, not before
    data_collator=data_collator,  # Completion-only training
    tokenizer=tokenizer,
)

print("✓ SFTTrainer initialized")
print(f"✓ Trainable parameters: {sum(p.numel() for p in trainer.model.parameters() if p.requires_grad):,}")

## 8. Training

Training will run for 3 epochs. Expected time: 20-30 minutes on T4 GPU.
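
With a dataset this small, the whole run consists of only a few dozen optimizer updates. The sketch below approximates how the step counts fall out of the batch settings above. It assumes 112 training samples (80% of 140, which holds only if the per-type splits round evenly) and a single training process. The function `schedule` is a hypothetical helper, and the exact rounding inside the Trainer may differ between library versions.

```python
import math


def schedule(num_samples, per_device_batch, grad_accum, epochs, warmup_ratio):
    """Estimate optimizer updates per epoch, total updates, and warmup steps."""
    batches_per_epoch = math.ceil(num_samples / per_device_batch)
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_updates = updates_per_epoch * epochs
    warmup_steps = math.ceil(total_updates * warmup_ratio)
    return updates_per_epoch, total_updates, warmup_steps


# Assumption: 112 training samples; other values match the training configuration.
per_epoch, total, warmup = schedule(
    num_samples=112,
    per_device_batch=2,
    grad_accum=4,
    epochs=3,
    warmup_ratio=0.03,
)

print(f"Updates per epoch: {per_epoch}")
print(f"Total updates:     {total}")
print(f"Warmup steps:      {warmup}")
```

Under these assumptions there are 14 updates per epoch and 42 in total, so `logging_steps=10` yields only a handful of logged loss values and the warmup phase lasts about two updates.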

In [None]:
print("=" * 60)
print("STARTING TRAINING")
print("=" * 60)
print(f"Training samples: {len(train_dataset)}")
print(f"Validation samples: {len(val_dataset)}")
print(f"Estimated time: 20-30 minutes")
print("=" * 60)

# Clear cache before training
torch.cuda.empty_cache()

# Train
train_result = trainer.train()

print("\n" + "=" * 60)
print("TRAINING COMPLETE")
print("=" * 60)
print(f"Final training loss: {train_result.metrics['train_loss']:.4f}")

In [None]:
# Save fine-tuned model
MODEL_OUTPUT = "/kaggle/working/solana-vuln-model"
trainer.save_model(MODEL_OUTPUT)
tokenizer.save_pretrained(MODEL_OUTPUT)

print(f"✓ Model saved to: {MODEL_OUTPUT}")
print(f"✓ Contains LoRA adapter weights only (base model weights are not saved)")

## 9. Evaluation

Evaluate on held-out test set (10% of data, never seen during training).

In [None]:
def extract_code(sample):
    """Extract code from formatted sample text."""
    text = sample['text']
    start = '<|start_header_id|>user<|end_header_id|>'
    end = '<|eot_id|><|start_header_id|>assistant'
    
    start_idx = text.find(start)
    end_idx = text.find(end)
    
    if start_idx != -1 and end_idx != -1:
        return text[start_idx + len(start):end_idx].strip()
    return text[:1000]


def predict(sample):
    """Generate prediction using fine-tuned model."""
    code = extract_code(sample)
    
    prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a smart contract security analyzer.
You analyze Solana smart contracts written in Rust and identify vulnerabilities.
Classify the code as either VULNERABLE or SAFE.
Respond with only one word: VULNERABLE or SAFE.<|eot_id|><|start_header_id|>user<|end_header_id|>

{code}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""
    
    inputs = tokenizer(
        prompt, 
        return_tensors="pt", 
        truncation=True, 
        max_length=1024
    ).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=10,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id
        )
    
    # Extract only new tokens
    response = tokenizer.decode(
        outputs[0][inputs['input_ids'].shape[1]:], 
        skip_special_tokens=True
    ).strip().upper()
    
    # Parse response; any output that does not start with 'VULN' is counted as SAFE
    first_word = response.split()[0] if response.split() else ""
    return 'VULNERABLE' if 'VULN' in first_word else 'SAFE'


print("✓ Prediction function ready")

In [None]:
print("=" * 60)
print("RUNNING EVALUATION")
print("=" * 60)

model.eval()  # Make sure dropout (including LoRA dropout) is off during generation
results = []
for sample in tqdm(test_data, desc="Evaluating"):
    pred = predict(sample)
    results.append({
        'vulnerability_type': sample['vulnerability_type'],
        'ground_truth': sample['label'],
        'prediction': pred,
        'correct': sample['label'] == pred
    })

print(f"\n✓ Evaluated {len(results)} test samples")

## 10. Results

Calculate and display metrics per vulnerability type and overall.
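
For reference, the four metrics reduce to simple ratios of confusion-matrix counts, with VULNERABLE as the positive class. The sketch below is a minimal pure-Python version that returns 0 when a denominator is empty, mirroring `zero_division=0`. The function name `binary_metrics` and the example counts are hypothetical, not results from this experiment.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=6, fp=2, fn=1, tn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because each vulnerability type contributes only a few test samples, per-type values can take only a small set of distinct values and should be read with that granularity in mind.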

In [None]:
# Calculate metrics per vulnerability type
metrics_by_type = {}
vuln_types = sorted(set(r['vulnerability_type'] for r in results))

for vtype in vuln_types:
    type_results = [r for r in results if r['vulnerability_type'] == vtype]
    gt = [r['ground_truth'] for r in type_results]
    pred = [r['prediction'] for r in type_results]
    
    metrics_by_type[vtype] = {
        'Accuracy': round(accuracy_score(gt, pred), 2),
        'Precision': round(precision_score(gt, pred, pos_label='VULNERABLE', zero_division=0), 2),
        'Recall': round(recall_score(gt, pred, pos_label='VULNERABLE', zero_division=0), 2),
        'F1-score': round(f1_score(gt, pred, pos_label='VULNERABLE', zero_division=0), 2),
        'Count': len(type_results)
    }

# Calculate macro averages
avg_metrics = {
    'Accuracy': round(np.mean([m['Accuracy'] for m in metrics_by_type.values()]), 2),
    'Precision': round(np.mean([m['Precision'] for m in metrics_by_type.values()]), 2),
    'Recall': round(np.mean([m['Recall'] for m in metrics_by_type.values()]), 2),
    'F1-score': round(np.mean([m['F1-score'] for m in metrics_by_type.values()]), 2)
}

# Display results
print("=" * 75)
print("RESULTS: Experiment 2 - Fine-Tuning with QLoRA")
print("=" * 75)
print(f"{'Vulnerability':<15} {'Accuracy':<10} {'Precision':<10} {'Recall':<10} {'F1':<10} {'N':<5}")
print("-" * 75)

for vtype in vuln_types:
    m = metrics_by_type[vtype]
    print(f"{vtype:<15} {m['Accuracy']:<10} {m['Precision']:<10} {m['Recall']:<10} {m['F1-score']:<10} {m['Count']:<5}")

print("-" * 75)
print(f"{'AVERAGE':<15} {avg_metrics['Accuracy']:<10} {avg_metrics['Precision']:<10} {avg_metrics['Recall']:<10} {avg_metrics['F1-score']:<10}")
print("=" * 75)

## 11. Confusion Matrix

In [None]:
# Overall confusion matrix
all_gt = [r['ground_truth'] for r in results]
all_pred = [r['prediction'] for r in results]
cm = confusion_matrix(all_gt, all_pred, labels=['VULNERABLE', 'SAFE'])

# Plot
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm, 
    annot=True, 
    fmt='d', 
    cmap='Greens',
    xticklabels=['VULNERABLE', 'SAFE'],
    yticklabels=['VULNERABLE', 'SAFE'],
    annot_kws={'size': 16}
)
plt.xlabel('Predicted', fontsize=12)
plt.ylabel('Actual', fontsize=12)
plt.title('Experiment 2: Fine-Tuning (QLoRA) - Confusion Matrix', fontsize=14)
plt.tight_layout()
plt.savefig('/kaggle/working/cm_fine_tuning.png', dpi=150, bbox_inches='tight')
plt.show()

print(f"\nConfusion Matrix Breakdown:")
print(f"  TP (correctly detected vulnerabilities): {cm[0,0]}")
print(f"  FN (missed vulnerabilities):             {cm[0,1]}")
print(f"  FP (false alarms):                       {cm[1,0]}")
print(f"  TN (correctly identified safe code):     {cm[1,1]}")

## 12. Save Results

In [None]:
# Save detailed results
results_df = pd.DataFrame(results)
results_df.to_csv('/kaggle/working/results_fine_tuning.csv', index=False)

# Overall metrics
overall_acc = accuracy_score(all_gt, all_pred)
overall_prec = precision_score(all_gt, all_pred, pos_label='VULNERABLE', zero_division=0)
overall_rec = recall_score(all_gt, all_pred, pos_label='VULNERABLE', zero_division=0)
overall_f1 = f1_score(all_gt, all_pred, pos_label='VULNERABLE', zero_division=0)

# Save comprehensive summary
summary = {
    'experiment': 'Fine-Tuning with QLoRA',
    'experiment_id': 2,
    'model': {
        'base': 'meta-llama/Llama-3.1-8B-Instruct',
        'quantization': '4-bit NF4 (QLoRA)',
        'parameters': '8B total, ~168M trainable (LoRA r=64 on 7 projection modules)'
    },
    'method': {
        'type': 'Fine-Tuning',
        'technique': 'QLoRA + SFTTrainer + DataCollatorForCompletionOnlyLM',
        'key_innovation': 'Loss computed only on classification output',
        'training_required': True
    },
    'lora_config': {
        'r': LORA_CONFIG.r,
        'alpha': LORA_CONFIG.lora_alpha,
        'dropout': LORA_CONFIG.lora_dropout,
        'target_modules': list(LORA_CONFIG.target_modules)
    },
    'training_config': {
        'epochs': TRAINING_CONFIG.num_train_epochs,
        'learning_rate': TRAINING_CONFIG.learning_rate,
        'batch_size': TRAINING_CONFIG.per_device_train_batch_size,
        'gradient_accumulation': TRAINING_CONFIG.gradient_accumulation_steps,
        'effective_batch_size': TRAINING_CONFIG.per_device_train_batch_size * TRAINING_CONFIG.gradient_accumulation_steps,
        'max_seq_length': TRAINING_CONFIG.max_seq_length,
        'warmup_ratio': TRAINING_CONFIG.warmup_ratio,
        'optimizer': 'paged_adamw_32bit'
    },
    'dataset': {
        'total': len(dataset),
        'train': len(train_data),
        'val': len(val_data),
        'test': len(test_data),
        'vulnerability_types': len(vuln_types)
    },
    'results': {
        'overall': {
            'accuracy': round(overall_acc, 4),
            'precision': round(overall_prec, 4),
            'recall': round(overall_rec, 4),
            'f1_score': round(overall_f1, 4)
        },
        'macro_average': avg_metrics,
        'per_vulnerability_type': metrics_by_type
    },
    'confusion_matrix': {
        'TP': int(cm[0,0]),
        'FN': int(cm[0,1]),
        'FP': int(cm[1,0]),
        'TN': int(cm[1,1])
    },
    'references': [
        'Dettmers et al. (2023). QLoRA: Efficient Finetuning of Quantized LLMs. arXiv:2305.14314',
        'Hu et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685'
    ]
}

with open('/kaggle/working/summary_fine_tuning.json', 'w') as f:
    json.dump(summary, f, indent=2)

print("=" * 60)
print("FILES SAVED")
print("=" * 60)
print("✓ results_fine_tuning.csv     (detailed predictions)")
print("✓ summary_fine_tuning.json    (experiment summary)")
print("✓ cm_fine_tuning.png          (confusion matrix)")
print("✓ solana-vuln-model/          (LoRA adapter weights)")

## 13. Summary

In [None]:
print("\n" + "=" * 60)
print("EXPERIMENT 2 COMPLETE")
print("=" * 60)
print(f"\nMethod: Fine-Tuning with QLoRA")
print(f"Model: LLaMA-3.1-8B-Instruct (fine-tuned)")
print(f"\nKey Results:")
print(f"  Overall Accuracy: {overall_acc:.2%}")
print(f"  Overall F1-Score: {overall_f1:.2%}")
print(f"  Macro Avg Accuracy: {avg_metrics['Accuracy']}")
print(f"  Macro Avg F1-Score: {avg_metrics['F1-score']}")
print(f"\nKey Technique:")
print(f"  DataCollatorForCompletionOnlyLM restricts the loss to the")
print(f"  classification label instead of the full prompt.")
print(f"\n" + "=" * 60)
print("IMPORTANT: Download 'solana-vuln-model/' for Experiment 3")
print("=" * 60)